The customer's database runs version 10.2.0.4, and an instance crashed suddenly. The customer asked us to help analyze the cause of the outage. For a sudden crash like this, the first place to look is the database's alert log. It shows that shortly before the crash the SMON process reported an ORA-00600 [15709] error, after which the database logged the message "Fatal internal error happened while SMON was doing active transaction recovery." In other words, SMON hit an exception while recovering active transactions, and the instance was then terminated. The log output is as follows:
Fri Sep 26 10:53:35 2014
Errors in file /oracle/app/oracle/admin/wxyydb/bdump/wxyydb_smon_28997.trc:
ORA-00600: internal error code, arguments: [15709], [29], [1], [], [], [], [], []
ORA-30319: Message 30319 not found; product=RDBMS; facility=ORA
Fri Sep 26 10:53:55 2014
Fatal internal error happened while SMON was doing active transaction recovery.
Fri Sep 26 10:53:55 2014
Errors in file /oracle/app/oracle/admin/wxyydb/bdump/wxyydb_smon_28997.trc:
ORA-00600: internal error code, arguments: [15709], [29], [1], [], [], [], [], []
ORA-30319: Message 30319 not found; product=RDBMS; facility=ORA
SMON: terminating instance due to error 474
Termination issued to instance processes. Waiting for the processes to exit
Fri Sep 26 10:54:05 2014
Instance termination failed to kill one or more processes
Instance terminated by SMON, pid = 28997
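When an alert log is long, the same first pass can be scripted. The following is a minimal sketch in Python, assuming the usual 10g layout in which each timestamp sits on its own line followed by its messages; the function name `scan_alert_log` and the event labels are illustrative, not part of any Oracle tool.

```python
import re
from datetime import datetime

# Timestamp lines look like "Fri Sep 26 10:53:35 2014" (English day/month names assumed).
TIMESTAMP = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}$")
# Capture the first argument of an ORA-00600, e.g. "15709".
ORA_600 = re.compile(r"ORA-00600: internal error code, arguments: \[([^\]]*)\]")

SAMPLE = """\
Fri Sep 26 10:53:35 2014
Errors in file /oracle/app/oracle/admin/wxyydb/bdump/wxyydb_smon_28997.trc:
ORA-00600: internal error code, arguments: [15709], [29], [1], [], [], [], [], []
ORA-30319: Message 30319 not found; product=RDBMS; facility=ORA
Fri Sep 26 10:53:55 2014
Fatal internal error happened while SMON was doing active transaction recovery.
Fri Sep 26 10:53:55 2014
Errors in file /oracle/app/oracle/admin/wxyydb/bdump/wxyydb_smon_28997.trc:
ORA-00600: internal error code, arguments: [15709], [29], [1], [], [], [], [], []
ORA-30319: Message 30319 not found; product=RDBMS; facility=ORA
SMON: terminating instance due to error 474
"""


def scan_alert_log(lines):
    """Return (timestamp, kind, detail) tuples for ORA-00600 and termination lines."""
    current = None  # most recent timestamp seen
    events = []
    for raw in lines:
        line = raw.strip()
        if TIMESTAMP.match(line):
            current = datetime.strptime(line, "%a %b %d %H:%M:%S %Y")
            continue
        match = ORA_600.search(line)
        if match:
            events.append((current, "ORA-00600", match.group(1)))
        elif "terminating instance due to error" in line:
            events.append((current, "TERMINATE", line))
    return events


if __name__ == "__main__":
    for when, kind, detail in scan_alert_log(SAMPLE.splitlines()):
        print(when, kind, detail)
```

To scan a real log, pass the open file object instead of `SAMPLE.splitlines()`.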
Next, let's look at the wxyydb_smon_28997.trc file. It shows that the SMON process had been attempting parallel transaction recovery and repeatedly caught error 30319 from about 10:10 onward. At 10:53:35 the recovery hit the ORA-00600 error, and this internal exception is what finally brought the instance down.
*** 2014-09-26 10:10:36.236
Parallel Transaction recovery caught error 30319
*** 2014-09-26 10:15:10.643
Parallel Transaction recovery caught exception 30319
*** 2014-09-26 10:15:21.816
Parallel Transaction recovery caught error 30319
*** 2014-09-26 10:19:51.707
Parallel Transaction recovery caught exception 30319
*** 2014-09-26 10:53:35.830
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [15709], [29], [1], [], [], [], [], []
ORA-30319: Message 30319 not found; product=RDBMS; facility=ORA
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+64 call ksedst1() 000000000 ? 000000001 ?
ksedmp()+2176 call ksedst() 000000000 ? C000000000000C9F ? 4000000004057F40 ? 000000000 ? 000000000 ? 000000000 ?
ksfdmp()+48 call ksedmp() 000000003 ?
kgeriv()+336 call ksfdmp() C000000000000695 ? 000000003 ? 40000000095185E0 ? 00000EC33 ? 000000000 ? 000000000 ? 000000000 ? 000000000 ?
kgeasi()+416 call kgeriv() 6000000000031770 ? 6000000000032828 ? 4000000001A504E0 ? 000000002 ? 9FFFFFFFFFFFA138 ?
$cold_kxfpqsrls()+1168 call kgeasi() 6000000000031770 ? 9FFFFFFFFD3D2290 ? 000003D5D ? 000000002 ? 000000002 ? 0000003E7 ? 000003D5D ? 9FFFFFFFFD3D22A0 ?
kxfpqrsod()+1104 call $cold_kxfpqsrls() C0000004FDF7A838 ? C0000004FDF74430 ? 000000004 ? 9FFFFFFFFFFFA200 ? C0000000000011AB ? 4000000003AA1250 ? 00000EDF5 ? 000000001 ?
kxfpdelqrefs()+640 call kxfpqrsod() C0000004FDF74430 ? 000000001 ? 60000000000B6300 ? C000000000000694 ? 4000000003DD14F0 ? 00000EE2D ? 60000000000C6708 ?
kxfpqsod_qc_sod()+2016 call kxfpdelqrefs() 00000003E ? 000000001 ? 60000000000B6300 ? C000000000001028 ? 40000000025DE5A0 ? 4000000001B1A110 ? 60000000000C2D04 ? 60000000000C2E90 ?
kxfpqsod()+816 call kxfpqsod_qc_sod() 000000010 ? 000000001 ? 9FFFFFFFFFFFA260 ? 60000000000B6300 ? 9FFFFFFFFFFFA7F0 ? C000000000001028 ? 40000000025DF810 ? 00000EE65 ?
ktprdestroy()+208 call kxfpqsod() C0000004FDF7A838 ? 000000001 ? 9FFFFFFFFFFFA810 ? 60000000000B6300 ? 9FFFFFFFFFFFAD90 ?
ktprbeg()+8272 call ktprdestroy() C000000000001026 ? 40000000025615B0 ? 000006E61 ? 000000000 ? 4000000001052E40 ? 000000000 ?
ktmmon()+10096 call ktprbeg() 9FFFFFFFFFFFBE70 ? 9FFFFFFFFFFFADA0 ? 60000000000B6300 ? 40000000028B75A0 ? 00000EF21 ? 9FFFFFFFFFFFADD8 ? 9FFFFFFFFFFFADE0 ?
ktmSmonMain()+64 call ktmmon() 9FFFFFFFFFFFD140 ?
ksbrdp()+2816 call ktmSmonMain() C000000100E1CA60 ? C000000000000FA5 ? 000007361 ? 4000000003B5AE10 ? C000000000000205 ? 400000000409DCD0 ?
opirip()+1136 call ksbrdp() 9FFFFFFFFFFFD150 ? 60000000000B6300 ? 9FFFFFFFFFFFDC90 ? 4000000002863EF0 ? 000004861 ? C000000000000B1D ? 60000000000318F0 ?
$cold_opidrv()+1408 call opirip() 9FFFFFFFFFFFEA70 ? 000000004 ? 9FFFFFFFFFFFF090 ? 9FFFFFFFFFFFDCA0 ? 60000000000B6300 ? C000000000000DA1 ?
sou2o()+336 call $cold_opidrv() 000000032 ? 9FFFFFFFFFFFF090 ? 60000000000C2C78 ?
$cold_opimai_real()+640 call sou2o() 9FFFFFFFFFFFF0B0 ? 000000032 ? 000000004 ? 9FFFFFFFFFFFF090 ?
main()+368 call $cold_opimai_real() 000000003 ? 000000000 ?
main_opd_entry()+80 call main() 000000003 ? 9FFFFFFFFFFFF598 ? 60000000000B6300 ? C000000000000004 ?
Searching Oracle Support for ORA-00600 [15709] turns up the document "SMON may fail with ORA-00600 [15709] Errors Crashing the Instance" (Document ID 736348.1), which describes an error very similar to the one we hit. The document lists the call stack as: kxfpqsrls <- kxfpqrsod <- kxfpdelqrefs <- kxfpqsod_qc_sod <- kxfpqsod <- ktprdestroy <- ktprbeg <- ktmmon. The call stack in the SMON trace essentially matches this, so the crash is most likely caused by hitting bug 6954722 during transaction recovery. If similar crashes still occur after that patch is installed, the likely culprit is another, similar bug, 9233544; Oracle has several bugs with these symptoms.
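The stack comparison described above can also be done mechanically. The sketch below, in Python, pulls the calling function names out of a call stack dump and checks whether a known signature appears in the same order, allowing extra frames in between. The names `frame_names` and `contains_in_order` are illustrative. Treating the `$cold_` prefix as noise to be stripped is an assumption, and the signature uses only the frame names from the support note that also appear verbatim in the trace.

```python
import re

# A frame line starts with "name()" optionally prefixed by "$cold_".
FRAME = re.compile(r"^(?:\$cold_)?([A-Za-z_][A-Za-z0-9_]*)\(\)")

# Frame names taken from the support note as quoted in the text (innermost first).
SIGNATURE = ["kxfpqsrls", "kxfpqrsod", "kxfpdelqrefs", "kxfpqsod", "ktprdestroy", "ktmmon"]

# Excerpt of the SMON trace; offsets and argument columns trimmed for brevity.
SAMPLE_TRACE = """\
ksedmp: internal or fatal error
----- Call Stack Trace -----
ksedst()  call  ksedst1()
ksedmp()  call  ksedst()
kgeasi()  call  kgeriv()
$cold_kxfpqsrls()  call  kgeasi()
kxfpqrsod()  call  $cold_kxfpqsrls()
kxfpdelqrefs()  call  kxfpqrsod()
kxfpqsod_qc_sod()  call  kxfpdelqrefs()
kxfpqsod()  call  kxfpqsod_qc_sod()
ktprdestroy()  call  kxfpqsod()
ktprbeg()  call  ktprdestroy()
ktmmon()  call  ktprbeg()
ktmSmonMain()  call  ktmmon()
"""


def frame_names(trace_lines):
    """Return the calling function of each stack frame, in trace order."""
    names = []
    for line in trace_lines:
        match = FRAME.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


def contains_in_order(names, signature):
    """True if every signature entry occurs in names, in the same relative order."""
    remaining = iter(names)
    # "entry in remaining" consumes the iterator up to the match, enforcing order.
    return all(entry in remaining for entry in signature)


if __name__ == "__main__":
    names = frame_names(SAMPLE_TRACE.splitlines())
    print(contains_in_order(names, SIGNATURE))
```

A match only says the stacks look alike; confirming the bug still depends on the version and patch level checks discussed in the text.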
Bug 6954722 affects versions 9.2.0.8 and 10.2.0.4 and is fixed in 10.2.0.4.2, 10.2.0.5, 11.1.0.7, and 11.2.0.1. The solutions for bug 6954722 are:
1. Use the following workaround
Set fast_start_parallel_rollback = false and recovery_parallelism = 0
OR
2. Apply one-off < >, if available for your platform/version.
OR
3. Upgrade to fixed release 10.2.0.5, 11.1.0.7 or 11.2.0.1.
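For the first option, the parameter changes can be written as ALTER SYSTEM statements. The Python sketch below builds them from the current parameter values (for example as read from v$parameter). The function name `workaround_statements` is hypothetical. The SCOPE choices are assumptions to verify for your release: they presume the instance uses an spfile, that fast_start_parallel_rollback can be changed on a running instance, and that recovery_parallelism is static and only takes effect after a restart.

```python
# Target values from the workaround, with the SCOPE assumed for each parameter.
WORKAROUND = {
    "fast_start_parallel_rollback": ("FALSE", "BOTH"),
    "recovery_parallelism": ("0", "SPFILE"),  # assumed static: needs a restart
}


def workaround_statements(current):
    """Return ALTER SYSTEM statements for parameters not yet at the workaround value.

    `current` maps parameter names to their present values; missing names are
    treated as not set to the workaround value.
    """
    statements = []
    for name, (wanted, scope) in WORKAROUND.items():
        have = str(current.get(name, "")).strip().upper()
        if have != wanted:
            statements.append(f"ALTER SYSTEM SET {name} = {wanted} SCOPE = {scope};")
    return statements


if __name__ == "__main__":
    # Hypothetical current settings for illustration.
    current = {"fast_start_parallel_rollback": "LOW", "recovery_parallelism": "0"}
    for statement in workaround_statements(current):
        print(statement)
```

Disabling parallel rollback means SMON recovers dead transactions serially, so recovery of large transactions may take longer after the change.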
Bug 9233544 affects versions 10.2.0.4, 11.1.0.7, and 11.2.0.1 and is fixed in 11.2.0.3 and 12.1. The solutions for bug 9233544 are:
1. Apply patchset 11.2.0.3, in which Bug: 9233544 is fixed.
OR
2. Check if one-off Patch: 9233544 is available for your release and platform.
We carefully checked the patches installed on the system and found that patch 6954722 was already applied, which suggests the system is affected by bug 9233544 instead. The fix is either to upgrade to 11.2.0.3 or to install the one-off patch 9233544. Upgrading to 11.2.0.3 is too large a change, so I advised the customer to consider installing the one-off patch.
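The patch check and the reasoning that follows from it can be sketched in Python as below. It assumes the patch inventory has been saved as text and that each installed patch appears on a line beginning with "Patch" followed by its number; that layout, the sample text, and the names `installed_patches` and `next_step` are all assumptions for illustration, so adjust the pattern to your actual inventory output.

```python
import re

# Assumed layout: a line starting with "Patch", whitespace, then the patch number.
PATCH_LINE = re.compile(r"^\s*Patch\s+(\d+)\b")

# Made-up inventory excerpt, for illustration only.
SAMPLE_INVENTORY = """\
Interim patches (1) :
Patch  6954722      : applied on <date>
"""


def installed_patches(inventory_text):
    """Return the set of patch numbers found in the inventory text."""
    patches = set()
    for line in inventory_text.splitlines():
        match = PATCH_LINE.match(line)
        if match:
            patches.add(match.group(1))
    return patches


def next_step(patches):
    """Suggest the next action for ORA-00600 [15709], following the text's reasoning."""
    if "9233544" in patches:
        return "both fixes are present; investigate further with Oracle Support"
    if "6954722" in patches:
        return "6954722 is present; if the error persists, check for one-off patch 9233544"
    return "6954722 is missing; apply it or use the parameter workaround"


if __name__ == "__main__":
    print(next_step(installed_patches(SAMPLE_INVENTORY)))
```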