oracle資料庫ORA-00600[kjhn_post_ha_alert0-862]原因分析

來源:互聯網
上載者:User


資料庫版本和平台資訊

資料庫版本為10.2.0.1版本,而且是32位的win 2003 sp2之上
ORACLE V10.2.0.1.0 - Production vsnsta=0
vsnsql=14 vsnxtr=3
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
Windows Server 2003 Version V5.2 Service Pack 2
CPU                 : 2 - type 586, 1 Physical Cores
Process Affinity    : 0x00000000
Memory (Avail/Total): Ph:2608M/3990M, Ph+PgF:4511M/5871M, VA:1242M/2047M
Instance name: orcl
資料庫報大量ORA-600[kjhn_post_ha_alert0-862]錯誤
資料庫的mmon進程報大量ORA-00600: internal error code, arguments: [kjhn_post_ha_alert0-862], [], [], [], [], [], [], []錯誤
Wed Jun 03 21:50:40 2015
Restarting dead background process MMON
MMON started with pid=11, OS id=3804
Wed Jun 03 21:50:43 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_mmon_3804.trc:
ORA-00600: internal error code, arguments: [kjhn_post_ha_alert0-862], [], [], [], [], [], [], []
 
Wed Jun 03 21:50:49 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_mmon_3804.trc:
ORA-00600: internal error code, arguments: [kjhn_post_ha_alert0-862], [], [], [], [], [], [], []
 
Wed Jun 03 21:55:44 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_mmon_3804.trc:
ORA-00600: internal error code, arguments: [kjhn_post_ha_alert0-862], [], [], [], [], [], [], []
 
Wed Jun 03 21:55:49 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_mmon_3804.trc:
ORA-00600: internal error code, arguments: [kjhn_post_ha_alert0-862], [], [], [], [], [], [], []
 
Wed Jun 03 22:00:40 2015
Thread 1 advanced to log sequence 476
  Current log# 1 seq# 476 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO01.LOG
Wed Jun 03 22:00:44 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_mmon_3804.trc:
ORA-00600: internal error code, arguments: [kjhn_post_ha_alert0-862], [], [], [], [], [], [], []
查詢對應trace檔案發現
ORA-00600: internal error code, arguments: [kjhn_post_ha_alert0-862], [], [], [] , [], [], [], []
Current SQL statement for this session:
BEGIN :success := dbms_ha_alerts_prvt.check_ha_resources; END;
人工執行該過程
SQL> var success varchar2
SQL> begin
  2  :success := sys.dbms_ha_alerts_prvt.check_ha_resources;
  3  end;
  4  /
 
PL/SQL procedure successfully completed.
 
SQL> print success
 
SUCCESS                                                                      
  
--------------------------------                                             
  
N     
通過查詢相關資料得到如下說明
@ This check is triggered with FAN enabled at this instance and it seems to be
@ associated with a startup action. From the procedure itself which is called
@ this is a run-once MMON (startup) action which supports instance down
@ notification reliability. It does the folowing a) registers the current
@ instance incarnation in recent_resource_incarnations$ if it's not already
@ there b) deletes recent_resource_incarnations$ records that don't apply to
@ this database. They may, e.g., have been copied from seed db or from a former
@ DataGuard primary c) scans recent_resource_incarnations$ for instance
@ incarnations that are no longer alive, and submits instance down alerts for
@ them . If all is good then return 'Y' else 'N' (or error) if there is a
@ failure. That failure is to get back to MMON, so that it may retry this
@ action later. In the local instance I get a 'Y' but in the customer's system
@ it fails with a 'N' which seems related to the ORA-600 assert.
 
@ This function is kjhn_post_ha_alert0() which is internal and does the real work of
@ posting HA alerts. It is used by both kjhn_post_ha_alert and
@ kjn_post_ha_alert_plsql. Its parameters are basically the same as those of
@ kjhn_post_ha_alert,other than the fact that it uses individual parameters
@ rather than the more easily extensible structure. Also the parameters passed
@ to it are the instance_name and the host_name which is the kernelized
@ implementation for posting HA alerts. Without actually having the arguments
@ the guess is that either the host_name or the instance_name raised in the
@ assert is null which triggered it.
mmon進程嘗試調用相關程式,然後無法得出正確值,返回N,然後會一直
嘗試,如果不能得到返回Y,就會一直報ORA-600,錯誤.通過上述的三種情況來說,都和recent_resource_incarnations$表有關係.
該故障原因是由於:mmon在調用kjhn_post_ha_alert0函數在執行的時候,如果發現參數host_name或者instance_name為null,就會報該錯誤出來.

處理方法

This problem has been documented as Bug 5173066 REPEATED ORA-600 [KJHN_POST_HA_ALERT0-862] FROM MMON PROCESS.
The bug is fixed in 11.1.0.6. A workaround is available for the problem.
該bug在11.1.0.6中得以修複
To implement the workaround, please execute the following steps as the SYS user:
 
1. Collect the following information and spool it to a file for your records.
 
a. output of select * from v$instance
b. show parameter instance_name
c. set pages 1000
d. select * from recent_resource_incarnations$
 
2. Create a backup table of recent_resource_incarnations$.
 
SQL> create table recent_resource_inc$bk as select * from recent_resource_incarnations$;
 
 
3. Truncate recent_resource_incarnations$. Be sure to do this while the instance is up and running.
    Do not issue this statement if a shutdown is pending.
 
SQL> truncate table recent_resource_incarnations$;
 
 
4. Perform a clean shutdown, followed by a startup.
具體參考:
ORA-600 [kjhn_post_ha_alert0-862] Continuously Repeated in the Alert Log (Doc ID 401640.1)
Bug 5173066 : REPEATED ORA-600 [KJHN_POST_HA_ALERT0-862] FROM MMON PROCESS

聯繫我們

該頁面正文內容均來源於網絡整理,並不代表阿里雲官方的觀點,該頁面所提到的產品和服務也與阿里云無關,如果該頁面內容對您造成了困擾,歡迎寫郵件給我們,收到郵件我們將在5個工作日內處理。

如果您發現本社區中有涉嫌抄襲的內容,歡迎發送郵件至: info-contact@alibabacloud.com 進行舉報並提供相關證據,工作人員會在 5 個工作天內聯絡您,一經查實,本站將立刻刪除涉嫌侵權內容。

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.