主要遇到了如下幾個問題:
1. mount 發現控制檔案異常,通過替換,用pfile mount成功,這個不說了.
2. open報了一個如下的錯誤:
Fri Jul 04 20:03:23 2014
alter database open
Beginning crash recovery of 1 threads
parallel recovery started with 15 processes
Started redo scan
Completed redo scan
read 229 KB redo, 0 data blocks need recovery
Errors in file d:\app\administrator\diag\rdbms\yunhaoorcl\yunhaoorcl\trace\yunhaoorcl_ora_3416.trc (incident=160589):
ORA-00600: 內部錯誤碼, 參數: [kcratr_nab_less_than_odr], [1], [11783], [8], [1181], [], [], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\yunhaoorcl\yunhaoorcl\incident\incdir_160589\yunhaoorcl_ora_3416_i160589.trc
Aborting crash recovery due to error 600
Errors in file d:\app\administrator\diag\rdbms\yunhaoorcl\yunhaoorcl\trace\yunhaoorcl_ora_3416.trc:
ORA-00600: 內部錯誤碼, 參數: [kcratr_nab_less_than_odr], [1], [11783], [8], [1181], [], [], [], [], [], [], []
Errors in file d:\app\administrator\diag\rdbms\yunhaoorcl\yunhaoorcl\trace\yunhaoorcl_ora_3416.trc:
ORA-00600: 內部錯誤碼, 參數: [kcratr_nab_less_than_odr], [1], [11783], [8], [1181], [], [], [], [], [], [], []
ORA-600 signalled during: alter database open...
Fri Jul 04 20:03:25 2014
Trace dumping is performing id=[cdmp_20140704200325]
Fri Jul 04 20:03:27 2014
Sweep [inc][160589]: completed
Sweep [inc2][160589]: completed
對於這個,比較少見,猜測可能是instance recovery的時候出現問題了。嘗試手工recover database:
SQL> recover database;
完成介質恢複。
SQL> archive log list;
資料庫記錄模式 非存檔模式
自動封存 禁用
存檔終點 USE_DB_RECOVERY_FILE_DEST
最早的聯機日誌序列 11781
當前日誌序列 11783
SQL> alter database open;
alter database open
*
第 1 行出現錯誤:
ORA-00600: 內部錯誤碼, 參數: [kcratr_nab_less_than_odr], [1], [11783], [8],
[1181], [], [], [], [], [], [], []
手工recover發現不行,看alert log報了一個error,看下對應的trace如下:
*** 2014-07-04 20:03:23.792
Successfully allocated 15 recovery slaves
Using 10 overflow buffers per recovery slave
Thread 1 checkpoint: logseq 11782, block 2, scn 230294634
cache-low rba: logseq 11782, block 52177
on-disk rba: logseq 11783, block 1181, scn 230308328
start recovery at logseq 11782, block 52177, scn 0
*** 2014-07-04 20:03:24.058
Started writing zeroblks thread 1 seq 11783 blocks 8-15
*** 2014-07-04 20:03:24.058
Completed writing zeroblks thread 1 seq 11783
==== Redo read statistics for thread 1 ====
Total physical reads (from disk and memory): 4322Kb
-- Redo read_disk statistics --
Read rate (ASYNC): 229Kb in 0.11s => 2.04 Mb/sec
Longest record: 3Kb, moves: 0/269 (0%)
Change moves: 6/101 (5%), moved: 0Mb
Longest LWN: 5Kb, moves: 0/115 (0%), moved: 0Mb
Last redo scn: 0x0000.0dba37ae (230307758)
----------------------------------------------
----- Recovery Hash Table Statistics ---------
Hash table buckets = 65536
Longest hash chain = 0
Average hash chain = 0/0 = 0.0
Max compares per lookup = 0
Avg compares per lookup = 0/0 = 0.0
----------------------------------------------
WARNING! Crash recovery of thread 1 seq 11783 is
ending at redo block 8 but should not have ended before
redo block 1181
我們來仔細觀察一下這個instance recovery的資訊,得到如下的資訊:
Thread 1 checkpoint: logseq 11782, block 2, scn 230294634
cache-low rba: logseq 11782, block 52177
on-disk rba: logseq 11783, block 1181, scn 230308328
start recovery at logseq 11782, block 52177, scn 0
線程檢查點: logseq 11782, block 2, scn 230294634
low cache rba: logseq 11782, block 52177
on disk rba: logseq 11783, block 1181, scn 230308328
可以看到執行個體恢複的起點是low cache rba(實際上oracle會比較線程檢查點和low cache rba,選擇其中的較大者作為執行個體恢複的啟點).
最後我們再來仔細分析下這個錯誤:
ORA-00600: 內部錯誤碼, 參數: [kcratr_nab_less_than_odr], [1], [11783], [8],[1181], [], [], [], [], [], [], []
kcratr_nab_less_than_odr: 大膽猜測應該在進行比較某個值
[1] : 應該是指的thread number
[11783]: on disk rba的logseq 值
[8]: 這裡未知
[1181]: on disk rba的block號
最後搜了一下MOS,Oracle 給出的解釋如下,關於這個ora-00600錯誤:
This caused a lost Write into the Online RedoLogs and so Instance Recovery is not possible and raising the ORA-600.
通俗一點講,即online log的寫丟失導致執行個體恢複無法恢複到指定的點,進而拋出這個ora-00600錯誤。
對於on disk rba,這個是oracle instance recovery必須要達到的值,如果無法恢複到該值,那麼將會出現異常,類似這裡的問題.
既然明白了這一點,那麼恢複就很容易了,首先重建下controlfile:
SQL> CREATE CONTROLFILE REUSE DATABASE "YUNHAOOR" NORESETLOGS NOARCHIVELOG
2 MAXLOGFILES 16
3 MAXLOGMEMBERS 3
4 MAXDATAFILES 100
5 MAXINSTANCES 8
6 MAXLOGHISTORY 1168
7 LOGFILE
8 GROUP 1 'D:\APP\ADMINISTRATOR\ORADATA\YUNHAOORCL\REDO01.LOG' SIZE 50M BLOCKSIZE 512,
9 GROUP 2 'D:\APP\ADMINISTRATOR\ORADATA\YUNHAOORCL\REDO02.LOG' SIZE 50M BLOCKSIZE 512,
10 GROUP 3 'D:\APP\ADMINISTRATOR\ORADATA\YUNHAOORCL\REDO03.LOG' SIZE 50M BLOCKSIZE 512
11 -- STANDBY LOGFILE
12 DATAFILE
13 'D:\APP\ADMINISTRATOR\ORADATA\YUNHAOORCL\SYSTEM01.DBF',
14 'D:\APP\ADMINISTRATOR\ORADATA\YUNHAOORCL\SYSAUX01.DBF',
15 'D:\APP\ADMINISTRATOR\ORADATA\YUNHAOORCL\UNDOTBS01.DBF',
16 'D:\APP\ADMINISTRATOR\ORADATA\YUNHAOORCL\USERS01.DBF',
17 'D:\APP\ADMINISTRATOR\ORADATA\YUNHAOORCL\FENG_SPACE.DBF',
18 'D:\APP\ADMINISTRATOR\ORADATA\YUNHAOORCL\SYSTEM02.DBF',
19 'D:\APP\ADMINISTRATOR\ORADATA\YUNHAOORCL\SYSTEM03.DBF',
20 'D:\APP\ADMINISTRATOR\ORADATA\YUNHAOORCL\USERS2',
21 'D:\APP\ADMINISTRATOR\ORADATA\YUNHAOORCL\SYSAUX1',
22 'D:\APP\ADMINISTRATOR\ORADATA\YUNHAOORCL\SYSTEM1'
23 CHARACTER SET ZHS16GBK
24 ;
控制檔案已建立。
SQL> recover database;
完成介質恢複。
最後開啟資料庫,發現又報錯了,不過還好,這是一個只要是DBA知道怎麼解決的錯了,如下:
SQL> alter database open;
alter database open
*
第 1 行出現錯誤:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [],
[], [], [], []
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [],
[], [], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [],
[], [], [], []
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [],
[], [], [], []
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [],
[], [], [], []
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [],
[], [], [], []
進程 ID: 6760
會話 ID: 534 序號: 1
SQL>
實際上查看alert log 還看到了ora-00600 [4193]錯誤。 這個處理方法一樣,不累述。
(incident=171742):
ORA-00600: internal error code, arguments: [4193], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\yunhaoorcl\yunhaoorcl\incident\incdir_171742\yunhaoorcl_smon_3884_i171742.trc
No Resource Manager plan active
Errors in file d:\app\administrator\diag\rdbms\yunhaoorcl\yunhaoorcl\trace\yunhaoorcl_ora_6760.trc (incident=171790):
ORA-00600: 內部錯誤碼, 參數: [4194], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\yunhaoorcl\yunhaoorcl\incident\incdir_171790\yunhaoorcl_ora_6760_i171790.trc
Fri Jul 04 20:12:34 2014
Trace dumping is performing id=[cdmp_20140704201234]
Trace dumping is performing id=[cdmp_20140704201235]
Doing block recovery for file 3 block 217
Resuming block recovery (PMON) for file 3 block 217
Block recovery from logseq 11784, block 63 to scn 230347995
Recovery of Online Redo Log: Thread 1 Group 3 Seq 11784 Reading mem 0
Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\YUNHAOORCL\REDO03.LOG
Block recovery stopped at EOT rba 11784.67.16
Block recovery completed at rba 11784.67.16, scn 0.230347992
Doing block recovery for file 3 block 144
Resuming block recovery (PMON) for file 3 block 144
Block recovery from logseq 11784, block 63 to scn 230347989
Recovery of Online Redo Log: Thread 1 Group 3 Seq 11784 Reading mem 0
Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\YUNHAOORCL\REDO03.LOG
Block recovery completed at rba 11784.65.16, scn 0.230347991
Errors in file d:\app\administrator\diag\rdbms\yunhaoorcl\yunhaoorcl\trace\yunhaoorcl_smon_3884.trc:
ORA-01595: error freeing extent (2) of rollback segment (2))
ORA-00600: internal error code, arguments: [4193], [], [], [], [], [], [], [], [], [], [], []
對於ora-00600 4193/4194錯誤,通過pfile指定undo_management=maual,然後啟庫重建undo即可,如下:
SQL> create undo tablespace undotbs2 datafile 'D:\APP\ADMINISTRATOR\ORADATA\YUNHAOORCL\undotbs2_01.dbf
2 size 4096m;
資料表空間已建立。
SQL> shutdown immediate
資料庫已經關閉。
已經卸載資料庫。
ORACLE 常式已經關閉。
SQL> startup mount pfile='D:\1.ora'
ORACLE 常式已經啟動。
Total System Global Area 9620525056 bytes
Fixed Size 2183872 bytes
Variable Size 4395633984 bytes
Database Buffers 5200936960 bytes
Redo Buffers 21770240 bytes
資料庫裝載完畢。
SQL> alter database open;
資料庫已更改。
SQL> show parameter undo
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
undo_management string AUTO
undo_retention integer 900
undo_tablespace string UNDOTBS2
SQL> drop tablespace undotbs1 including contents and datafiles;
資料表空間已刪除。
備忘:對於重建controlfile後,記得添加tempfile,不要給人接埋地雷!
ORA-600 [kcratr_nab_less_than_odr] during Instance Recovery after Database Crash (文檔 ID 1299564.1)
Alter database open fails with ORA-00600 kcratr_nab_less_than_odr (文檔 ID 1296264.1)