Oracle Bug Causes Instance Downtime: ORA-07445

Source: Internet
Author: User

Symptom:
The customer's database (a RAC environment running 11.1.0.6) suffered an unexpected instance crash, accompanied by an ORA-07445 error:
Sun Jun 23 01:00:06 2013
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0xF] [PC:0x755773D, kcbw_get_bh()+67]
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_mman_2015.trc (incident=298938):
ORA-07445: exception encountered: core dump [kcbw_get_bh()+67] [SIGSEGV] [ADDR:0xF] [PC:0x755773D] [Address not mapped to object] []
Incident details in: /oracle/app/11gR1/diag/rdbms/xij/xij1/incident/incdir_298938/xij1_mman_2015_i298938.trc
Sun Jun 23 01:00:07 2013
Trace dumping is performing id=[cdmp_20130623010007]
Sun Jun 23 01:00:09 2013
Sweep Incident [298938]: completed
Sun Jun 23 01:00:09 2013
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_pmon_1981.trc:
ORA-00822: MMAN process terminated with error.
PMON (ospid: 1981): terminating the instance due to error 822
Sun Jun 23 01:00:09 2013
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_j000_22268.trc:
ORA-00822: MMAN process terminated with error.
Sun Jun 23 01:00:09 2013
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_m000_22430.trc:
ORA-00822: MMAN process terminated with error.
System state dump is made for local instance
System State dumped to trace file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_diag_1987.trc
Sun Jun 23 01:00:09 2013
ORA-1092: opiodr aborting process unknown ospid (11096_47524616916112)
Sun Jun 23 01:00:09 2013
ORA-1092: opitsk aborting process
Sun Jun 23 01:00:09 2013
ORA-1092: opiodr aborting process unknown ospid (6317_47213365785744)
Sun Jun 23 01:00:09 2013
ORA-1092: opitsk aborting process
Sun Jun 23 01:00:09 2013
ORA-1092: opiodr aborting process unknown ospid (28698_47021312551056)
Sun Jun 23 01:00:09 2013
ORA-1092: opitsk aborting process
Sun Jun 23 01:00:09 2013
ORA-1092: opiodr aborting process unknown ospid (18927_475675000003456)
Sun Jun 23 01:00:10 2013
ORA-1092: opitsk aborting process
Sun Jun 23 01:00:10 2013
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_q004173487.trc:
ORA-00822: MMAN process terminated with error.
ORA-1092: opidrv aborting process Q001 ospid (3487_47252506410128)
Sun Jun 23 01:00:11 2013
ORA-1092: opitsk aborting process
Sun Jun 23 01:00:11 2013
License high water mark = 510
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_m000_22430.trc:
ORA-00822: MMAN process terminated with error.
ORA-00822: MMAN process terminated with error.
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_j000_22268.trc:
ORA-00449: background process 'lgwr' unexpectedly terminated with error 822
ORA-00822: MMAN process terminated with error.
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_j000_22268.trc:
ORA-00449: background process 'lgwr' unexpectedly terminated with error 822
ORA-00822: MMAN process terminated with error.
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_j000_22268.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00822: MMAN process terminated with error.
ORA-06512: at "WKSYS.WK_JOB", line 442
ORA-00449: background process 'mmon' unexpectedly terminated with error 822
ORA-00822: MMAN process terminated with error.
ORA-06512: at line 1
ORA-1092: opidrv aborting process J000 ospid (22268_47357930925200)
Sun Jun 23 01:00:20 2013
Instance terminated by PMON, pid = 1981
Sun Jun 23 01:00:21 2013
USER (ospid: 22527): terminating the instance
Instance terminated by USER, pid = 22527
Sun Jun 23 01:00:26 2013
Starting ORACLE instance (normal)
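
In a log like this the first error is usually the one worth chasing, and the later ORA-00822 and ORA-1092 entries follow from it. As a rough sketch (not an Oracle tool), the Python below tallies the ORA- codes in a list of alert-log lines and reports which code appeared first. The function name `tally_ora_codes` and the shortened sample lines are assumptions made for illustration.

```python
import re
from collections import Counter

# Matches codes such as ORA-07445, ORA-00822 and the short form ORA-1092.
ORA_CODE = re.compile(r"\bORA-(\d{1,5})\b")

def tally_ora_codes(lines):
    """Return (first_code, Counter) with codes normalised to five digits."""
    counts = Counter()
    first = None
    for line in lines:
        for digits in ORA_CODE.findall(line):
            code = f"ORA-{int(digits):05d}"
            counts[code] += 1
            if first is None:
                first = code  # remember the earliest code in the log
    return first, counts

# Shortened sample lines taken from the alert log excerpt above.
sample = [
    "ORA-07445: exception encountered: core dump [kcbw_get_bh()+67] [SIGSEGV]",
    "ORA-00822: MMAN process terminated with error.",
    "PMON (ospid: 1981): terminating the instance due to error 822",
    "ORA-1092: opiodr aborting process unknown ospid (11096_47524616916112)",
    "ORA-1092: opitsk aborting process",
]
first_code, counts = tally_ora_codes(sample)
print(first_code, dict(counts))
```

Run against the full alert log, the same function should show ORA-07445 as the first code, followed by many ORA-00822 and ORA-01092 entries raised while the instance was being terminated.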

Analysis:
ORA-07445 is usually caused by a bug in Oracle itself.
First, use IPS (the Incident Packaging Service) to collect the error information recorded in the alert log. (For how to use IPS, see my other article "simple use of IPS".)
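
As a hedged illustration of that step, the sketch below only builds an adrci command line that packages one incident with `ips pack`; it does not run it. The helper name `build_ips_pack_command` and the output directory `/tmp` are assumptions, and the exact IPS syntax should be checked with `help ips pack` in adrci on the installed release.

```python
import shlex

def build_ips_pack_command(homepath, incident_id, out_dir):
    """Build an adrci argument list that packages one incident with IPS."""
    # adrci accepts several commands in one exec string, separated by ';'.
    script = (
        f"set homepath {homepath}; "
        f"ips pack incident {int(incident_id)} in {out_dir}"
    )
    return ["adrci", f"exec={script}"]

# Incident 298938 and the ADR home come from the alert log above;
# /tmp is an assumed output directory.
cmd = build_ips_pack_command("diag/rdbms/xij/xij1", 298938, "/tmp")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` on the database server as the Oracle software owner. Printing it first makes it easy to review the command before anything is executed.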
A search of MetaLink shows that the customer's problem resembles the bug described in the following three notes:
ORA-7445 (kcbw_get_bh) [ID 1341402.1]
Bug 9728912 [https://bug.oraclecorp.com/pls/bug/webbug_edit.edit_info_top?rptno=9728912] - PMON terminates instance due to ORA-7445 [kcbw_numperchunk] / ORA-7445 [kcbw_get_bh] [ID 9728912.8]
Instance Crashed On ORA-7445 kcbw_numperchunk [ID 1364264.1]
However, according to the notes, the related bug has already been fixed in 11.1.0.6.
We therefore checked the other critical errors recorded in the customer's database:
Node1:
adrci> show problem

ADR Home = /oracle/app/11gR1/diag/rdbms/xij/xij1:
*************************************************************************
PROBLEM_ID  PROBLEM_KEY                                               LAST_INCIDENT  LASTINC_TIME
----------  --------------------------------------------------------  -------------  ----------------------
5           ORA 7445 [kcbw_get_bh()+67]                               298938         01:00:06.373716 +08:00
11          ORA 600                                                   276161         18:12:12.709933 +08:00
10          ORA 600 [729]                                             276160         18:09:27.857128 +08:00
7           ORA 7445 [kgghash()+367]                                  253234         15:27:04.349337 +08:00
9           ORA 7445 [kksMapCursor()+323]                             256538         09:54:58.684956 +08:00
8           ORA 7445 [qkabxo()+22]                                    251194         22:03:37.715416 +08:00
2           ORA 600 [kghfrh:ds]                                       238818         11:35:23.755034 +08:00
6           ORA 7445 [eoa_pm_push()+31]                               239218         11:24:42.835685 +08:00
3           ORA 7445 [ioei_get_method_counts()+39]                    71129          11:17:39.735719 +08:00
4           ORA 7445 [jol_calculate_transitive_interface_set()+1165]  74233          11:05:51.570021 +08:00
1           ORA 600 [kghfru:ds]                                       6369           17:35:55.001585 +08:00
11 rows fetched
Node2:
[oracle@XIJ02 ~]$ adrci

ADRCI: Release 11.1.0.6.0 - Beta on Mon Jun 24 14:59:37 2013

Copyright (c) 1982, 2007, Oracle. All rights reserved.
ADR base = "/oracle/app/11gR1"
adrci>
adrci>
adrci> set homepath diag/rdbms/xij/xij2
adrci>
adrci> show problem
ADR Home = /oracle/app/11gR1/diag/rdbms/xij/xij2:
*************************************************************************
PROBLEM_ID  PROBLEM_KEY                                               LAST_INCIDENT  LASTINC_TIME
----------  --------------------------------------------------------  -------------  ----------------------
1           ORA 7445 [kgghash()+367]                                  209965         23:34:39.333982 +08:00
2           ORA 7445 [kksMapCursor()+323]                             190129         09:54:56.121652 +08:00
2 rows fetched
adrci>
Solution:
In total, 13 problems suspected of being caused by bugs were recorded on the customer's two nodes (11 on node 1 and 2 on node 2). Oracle 11.1.0.6 is generally not a very stable release and contains many bugs.
Oracle fixed most of the bugs found in 11.1.0.6 in 11.1.0.7, which is much more stable. We therefore recommend upgrading the database to 11.1.0.7 or 11.2.0.3.
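
The count of 13 can be checked against the two `show problem` listings. The sketch below is only an illustration: the problem keys are transcribed by hand from the listings above (with extraction spacing removed), and the names `PROBLEMS` and `summarize` are made up for the example.

```python
from collections import Counter

# Problem keys transcribed from the two "show problem" listings.
PROBLEMS = {
    "xij1": [
        "ORA 7445 [kcbw_get_bh()+67]",
        "ORA 600",
        "ORA 600 [729]",
        "ORA 7445 [kgghash()+367]",
        "ORA 7445 [kksMapCursor()+323]",
        "ORA 7445 [qkabxo()+22]",
        "ORA 600 [kghfrh:ds]",
        "ORA 7445 [eoa_pm_push()+31]",
        "ORA 7445 [ioei_get_method_counts()+39]",
        "ORA 7445 [jol_calculate_transitive_interface_set()+1165]",
        "ORA 600 [kghfru:ds]",
    ],
    "xij2": [
        "ORA 7445 [kgghash()+367]",
        "ORA 7445 [kksMapCursor()+323]",
    ],
}

def summarize(problems_by_node):
    """Count problem entries in total, by distinct key, and by error class."""
    all_keys = [key for keys in problems_by_node.values() for key in keys]
    # The error class is the first two tokens, e.g. "ORA 7445" or "ORA 600".
    by_class = Counter(" ".join(key.split()[:2]) for key in all_keys)
    return {
        "total": len(all_keys),
        "distinct": len(set(all_keys)),
        "by_class": dict(by_class),
    }

summary = summarize(PROBLEMS)
shared = sorted(set(PROBLEMS["xij1"]) & set(PROBLEMS["xij2"]))
print(summary)
print(shared)
```

Two of the keys, `kgghash` and `kksMapCursor`, appear on both nodes, so the 13 entries correspond to 11 distinct problem keys.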



Appendix:
(Triage Tool 3.01, routed by file analysis):
Failing Function: kcbw_get_bh
Route To: buffer cache: MANAGEABILITY
Error Argument: [kcbw_get_bh]
Type of Error: ORA-07445
File Name: xij1_mman_2015_i298938.trc
Comment: Routed by Error Argument, Conventional routing
DB Version: 11.1.0.6.0
Platform: Linux CPU: x86_64
OS Version: 2.6.18-194.el5
Stack Trace: kcbw_get_bh kcbw_get_first_buffer kcbw_next_free logs kmgs_process_request_immediate kmgs_process_request kmgsdrv ksbabs ksbrdp opirip
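
The failing function reported by the triage output can also be read directly from the ORA-07445 message, which is handy when searching for matching notes. The regular expression below is a minimal sketch and not part of any Oracle tool. The name `failing_function` is hypothetical, and the pattern allows optional whitespace because copied logs sometimes show `kcbw_get_bh () + 67`.

```python
import re

# Captures the function name and offset from "core dump [func()+NN]".
CORE_DUMP = re.compile(
    r"core dump \[\s*([A-Za-z_]\w*)\s*\(\s*\)\s*\+\s*(\d+)\s*\]"
)

def failing_function(message):
    """Return (function_name, offset) from an ORA-07445 message, or None."""
    match = CORE_DUMP.search(message)
    if match is None:
        return None
    return match.group(1), int(match.group(2))

clean = "ORA-07445: exception encountered: core dump [kcbw_get_bh()+67] [SIGSEGV]"
spaced = "ORA-07445: exception encountered: core dump [kcbw_get_bh () + 67] [SIGSEGV]"
print(failing_function(clean))
print(failing_function(spaced))
```

Both forms of the message yield the same function name, `kcbw_get_bh`, which matches the first frame of the stack trace above.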




