The customer's database (RAC environment, 11.1.0.6) suffered an abnormal instance crash accompanied by an ORA-07445 error
Symptom:
The customer's database (RAC environment, 11.1.0.6) suffered an abnormal instance crash, accompanied by an ORA-07445 error:
Sun Jun 23 01:00:06 2013
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0xF] [PC:0x755773D, kcbw_get_bh()+67]
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_mman_2015.trc (incident=298938):
ORA-07445: exception encountered: core dump [kcbw_get_bh()+67] [SIGSEGV] [ADDR:0xF] [PC:0x755773D] [Address not mapped to object] []
Incident details in: /oracle/app/11gR1/diag/rdbms/xij/xij1/incident/incdir_298938/xij1_mman_2015_i298938.trc
Sun Jun 23 01:00:07 2013
Trace dumping is performing id=[cdmp_20130623010007]
Sun Jun 23 01:00:09 2013
Sweep Incident [298938]: completed
Sun Jun 23 01:00:09 2013
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_pmon_1981.trc:
ORA-00822: MMAN process terminated with error.
PMON (ospid: 1981): terminating the instance due to error 822
Sun Jun 23 01:00:09 2013
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_j000_22268.trc:
ORA-00822: MMAN process terminated with error.
Sun Jun 23 01:00:09 2013
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_m000_22430.trc:
ORA-00822: MMAN process terminated with error.
System state dump is made for local instance
System State dumped to trace file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_diag_1987.trc
Sun Jun 23 01:00:09 2013
ORA-1092: opiodr aborting process unknown ospid (11096_47524616916112)
Sun Jun 23 01:00:09 2013
ORA-1092: opitsk aborting process
Sun Jun 23 01:00:09 2013
ORA-1092: opiodr aborting process unknown ospid (6317_47213365785744)
Sun Jun 23 01:00:09 2013
ORA-1092: opitsk aborting process
Sun Jun 23 01:00:09 2013
ORA-1092: opiodr aborting process unknown ospid (28698_47021312551056)
Sun Jun 23 01:00:09 2013
ORA-1092: opitsk aborting process
Sun Jun 23 01:00:09 2013
ORA-1092: opiodr aborting process unknown ospid (18927_475675000003456)
Sun Jun 23 01:00:10 2013
ORA-1092: opitsk aborting process
Sun Jun 23 01:00:10 2013
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_q004173487.trc:
ORA-00822: MMAN process terminated with error.
ORA-1092: opidrv aborting process Q001 ospid (3487_47252506410128)
Sun Jun 23 01:00:11 2013
ORA-1092: opitsk aborting process
Sun Jun 23 01:00:11 2013
License high water mark = 510
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_m000_22430.trc:
ORA-00822: MMAN process terminated with error.
ORA-00822: MMAN process terminated with error.
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_j000_22268.trc:
ORA-00449: background process 'lgwr' unexpectedly terminated with error 822
ORA-00822: MMAN process terminated with error.
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_j000_22268.trc:
ORA-00449: background process 'lgwr' unexpectedly terminated with error 822
ORA-00822: MMAN process terminated with error.
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_j000_22268.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00822: MMAN process terminated with error.
ORA-06512: at "WKSYS.WK_JOB", line 442
ORA-00449: background process 'mmon' unexpectedly terminated with error 822
ORA-00822: MMAN process terminated with error.
ORA-06512: at line 1
ORA-1092: opidrv aborting process J000 ospid (22268_47357930925200)
Sun Jun 23 01:00:20 2013
Instance terminated by PMON, pid = 1981
Sun Jun 23 01:00:21 2013
USER (ospid: 22527): terminating the instance
Instance terminated by USER, pid = 22527
Sun Jun 23 01:00:26 2013
Starting ORACLE instance (normal)
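The order of events in a log like this is easier to see once the ORA- codes are pulled out with their timestamps. The following is a minimal Python sketch, not part of the original troubleshooting session: the function name `ora_timeline` and the shortened `sample` text are illustrative, and it assumes the alert log uses the "Sun Jun 23 01:00:06 2013" timestamp format seen above.

```python
import re
from datetime import datetime

# Timestamp lines in this excerpt look like "Sun Jun 23 01:00:06 2013".
TS_FORMAT = "%a %b %d %H:%M:%S %Y"
ORA_CODE = re.compile(r"\bORA-(\d+)")


def ora_timeline(lines):
    """Return (timestamp, 'ORA-nnnnn') pairs in the order they appear."""
    current_ts = None
    events = []
    for raw in lines:
        line = raw.strip()
        try:
            # A line that parses as a timestamp starts a new log entry.
            current_ts = datetime.strptime(line, TS_FORMAT)
            continue
        except ValueError:
            pass  # ordinary message line
        for code in ORA_CODE.findall(line):
            # Normalize "ORA-1092" and "ORA-01092" to the same five-digit form.
            events.append((current_ts, f"ORA-{int(code):05d}"))
    return events


# Shortened, hand-copied excerpt of the alert log shown above.
sample = """\
Sun Jun 23 01:00:06 2013
ORA-07445: exception encountered: core dump [kcbw_get_bh()+67] [SIGSEGV]
Sun Jun 23 01:00:09 2013
ORA-00822: MMAN process terminated with error.
PMON (ospid: 1981): terminating the instance due to error 822
Sun Jun 23 01:00:09 2013
ORA-1092: opitsk aborting process
""".splitlines()

events = ora_timeline(sample)
first_ts, first_code = events[0]
print("first error:", first_code, "at", first_ts)
```

Read this way, the ORA-07445 in MMAN comes first, and the ORA-00822 and ORA-1092 messages that follow are consequences of PMON terminating the instance.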
Analysis:
ORA-07445 is usually caused by a bug in Oracle itself.
First, use IPS (the Incident Packaging Service) to collect the error information recorded in the alert log. (For how to use IPS, see my other article, "Simple Use of IPS".)
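As a rough illustration of that step, the Python sketch below builds an adrci command line that packages one incident. It is an assumption-laden example rather than the procedure actually used here: the helper name `ips_pack_command` and the output directory `/tmp` are hypothetical, and the home path `diag/rdbms/xij/xij1` is inferred from the trace file paths in the alert log. It relies on adrci's `exec="..."` batch option and the `set homepath` and `ips pack incident ... in ...` commands.

```python
import shlex


def ips_pack_command(homepath, incident_id, out_dir):
    """Build the argv list for a one-shot adrci call that packages an incident."""
    # adrci accepts several commands in one exec= string, separated by ';'.
    script = f"set homepath {homepath}; ips pack incident {incident_id} in {out_dir}"
    return ["adrci", f"exec={script}"]


# Incident 298938 is the ORA-07445 incident from the alert log above.
# "/tmp" is only a placeholder output directory.
cmd = ips_pack_command("diag/rdbms/xij/xij1", 298938, "/tmp")

# Print a shell-quoted form for review; nothing is executed here.
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a host where adrci is on the PATH; the sketch only prints the command so that it can be checked first.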
A search on MetaLink shows that the customer's problem resembles the bug described in the following three notes:
ORA-7445 (kcbw_get_bh) [ID 1341402.1]
Bug 9728912 [https://bug.oraclecorp.com/pls/bug/webbug_edit.edit_info_top?rptno=9728912] - PMON terminates instance due to ORA-7445 [kcbw_numperchunk] / ORA-7445 [kcbw_get_bh] [ID 9728912.8]
Instance Crashed On ORA-7445 kcbw_numperchunk [ID 1364264.1]
However, according to the notes, the related bug has already been fixed in 11.1.0.6, the version the customer is running.
Next, check the other critical errors recorded in the customer's database:
Node1:
adrci> show problem
ADR Home = /oracle/app/11gR1/diag/rdbms/xij/xij1:
*************************************************************************
PROBLEM_ID  PROBLEM_KEY                                               LAST_INCIDENT  LASTINC_TIME
----------  --------------------------------------------------------  -------------  ----------------------
5           ORA 7445 [kcbw_get_bh()+67]                               298938         01:00:06.373716 +08:00
11          ORA 600                                                   276161         18:12:12.709933 +08:00
10          ORA 600 [729]                                             276160         18:09:27.857128 +08:00
7           ORA 7445 [kgghash()+367]                                  253234         15:27:04.349337 +08:00
9           ORA 7445 [kksMapCursor()+323]                             256538         09:54:58.684956 +08:00
8           ORA 7445 [qkabxo()+22]                                    251194         22:03:37.715416 +08:00
2           ORA 600 [kghfrh:ds]                                       238818         11:35:23.755034 +08:00
6           ORA 7445 [eoa_pm_push()+31]                               239218         11:24:42.835685 +08:00
3           ORA 7445 [ioei_get_method_counts()+39]                    71129          11:17:39.735719 +08:00
4           ORA 7445 [jol_calculate_transitive_interface_set()+1165]  74233          11:05:51.570021 +08:00
1           ORA 600 [kghfru:ds]                                       6369           17:35:55.001585 +08:00
11 rows fetched
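To get a quick overview of a problem list like this, the keys can be grouped by error number and by failing function or argument. The Python sketch below is illustrative only: the function name `summarize` is made up, and the `node1_problems` text is a hand-copied version of the rows above with the time column dropped.

```python
import re
from collections import Counter

# Matches "ORA 7445 [kcbw_get_bh()+67]" as well as a bare "ORA 600".
PROBLEM_KEY = re.compile(r"\bORA (\d+)(?:\s*\[([^\]]*)\])?")


def summarize(show_problem_output):
    """Count problems per ORA error and collect the distinct first arguments."""
    by_error = Counter()
    arguments = {}
    for line in show_problem_output.splitlines():
        m = PROBLEM_KEY.search(line)
        if not m:
            continue  # header, separator, or footer line
        error, arg = m.group(1), m.group(2)
        by_error[error] += 1
        if arg:
            # Drop the "()+offset" suffix so only the function name remains.
            name = re.sub(r"\s*\(\s*\).*$", "", arg)
            arguments.setdefault(error, set()).add(name)
    return by_error, arguments


# Problem id, problem key, and last incident copied from the Node1 output.
node1_problems = """\
5 ORA 7445 [kcbw_get_bh()+67] 298938
11 ORA 600 276161
10 ORA 600 [729] 276160
7 ORA 7445 [kgghash()+367] 253234
9 ORA 7445 [kksMapCursor()+323] 256538
8 ORA 7445 [qkabxo()+22] 251194
2 ORA 600 [kghfrh:ds] 238818
6 ORA 7445 [eoa_pm_push()+31] 239218
3 ORA 7445 [ioei_get_method_counts()+39] 71129
4 ORA 7445 [jol_calculate_transitive_interface_set()+1165] 74233
1 ORA 600 [kghfru:ds] 6369
"""

by_error, arguments = summarize(node1_problems)
for error, count in by_error.most_common():
    print(f"ORA-{error}: {count} problems, arguments: {sorted(arguments.get(error, []))}")
```

On the Node1 data this reports seven ORA-7445 problems, each in a different function, and four ORA-600 problems, so kcbw_get_bh is only one of several distinct failures on this instance.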
Node2:
[oracle@XIJ02 ~]$ adrci
ADRCI: Release 11.1.0.6.0 - Beta on Mon Jun 24 14:59:37 2013