Handling CRS-4124 and CRS-4000 Errors on a Single-Instance ASM Setup

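Because my laptop has limited resources, I installed Oracle 11g Grid Infrastructure (GI) on it so that I could study ASM. Everything worked normally after the installation.
After booting the machine today, however, starting HAS failed with the following errors: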
[root@myrac1 ~]# su - grid
[grid@myrac1 ~]$ crsctl start has

CRS-4124: Oracle High Availability Services startup failed.
CRS-4000: Command Start failed, or completed with errors.
CRS-4124: Oracle High Availability Services startup failed.
CRS-4000: Command Start failed, or completed with errors.

Check the ohasd.log log:
2014-02-19 18:02:42.143: [UiServer][2762939248] processMessage called
2014-02-19 18:02:42.144: [UiServer][2762939248] Sending message to PE. ctx= 0xa8be6268
2014-02-19 18:02:42.144: [UiServer][2762939248] Sending command to PE: 51
2014-02-19 18:02:42.144: [  CRSPE][2767141744] Processing PE command id=158. Description: [Stat Resource : 0xb1df7aa0]
2014-02-19 18:02:42.392: [  CRSPE][2767141744] PE Command [ Stat Resource : 0xb1df7aa0 ] has completed
2014-02-19 18:02:42.393: [  CRSPE][2767141744] UI Command [Stat Resource : 0xb1df7aa0] is replying to sender.
2014-02-19 18:02:42.395: [UiServer][2762939248] Done for ctx=0xa8be6268
2014-02-19 18:02:42.417: [UiServer][2756705136] Closed: remote end failed/disc.
2014-02-19 18:02:45.055: [UiServer][2756705136] S(0xa7fd3958): set Properties ( grid,0xb49c820)
2014-02-19 18:02:45.055: [UiServer][2756705136] S(0xa8bae2d8): Accepted client connection: saddr =(ADDRESS=(PROTOCOL=ipc)(DEV=36)(KEY=CRSD_UI_SOCKET))daddr = (ADDRESS=(PROTOCOL=ipc)(KEY=CRSD_UI_SOCKET))
2014-02-19 18:02:45.066: [UiServer][2762939248] processMessage called
2014-02-19 18:02:45.066: [UiServer][2762939248] Sending message to PE. ctx= 0xa8be8d90
2014-02-19 18:02:45.067: [UiServer][2762939248] Sending command to PE: 52
2014-02-19 18:02:45.067: [  CRSPE][2767141744] Processing PE command id=159. Description: [Stat Resource : 0xa45323b0]
2014-02-19 18:02:45.092: [  CRSPE][2767141744] PE Command [ Stat Resource : 0xa45323b0 ] has completed
2014-02-19 18:02:45.093: [  CRSPE][2767141744] UI Command [Stat Resource : 0xa45323b0] is replying to sender.
2014-02-19 18:02:45.107: [UiServer][2762939248] Done for ctx=0xa8be8d90
2014-02-19 18:02:45.115: [UiServer][2756705136] Closed: remote end failed/disc.
2014-02-19 18:02:46.416: [UiServer][2756705136] S(0xa7fd3958): set Properties ( grid,0xb49c788)
2014-02-19 18:02:46.416: [UiServer][2756705136] S(0xa8bae2d8): Accepted client connection: saddr =(ADDRESS=(PROTOCOL=ipc)(DEV=36)(KEY=CRSD_UI_SOCKET))daddr = (ADDRESS=(PROTOCOL=ipc)(KEY=CRSD_UI_SOCKET))
2014-02-19 18:02:46.427: [UiServer][2762939248] processMessage called
2014-02-19 18:02:46.428: [UiServer][2762939248] Sending message to PE. ctx= 0xa8bb23b0
2014-02-19 18:02:46.428: [UiServer][2762939248] Sending command to PE: 53
2014-02-19 18:02:46.428: [  CRSPE][2767141744] Processing PE command id=160. Description: [Stat Resource : 0xa453f3b8]
2014-02-19 18:02:46.436: [  CRSPE][2767141744] PE Command [ Stat Resource : 0xa453f3b8 ] has completed
2014-02-19 18:02:46.437: [  CRSPE][2767141744] UI Command [Stat Resource : 0xa453f3b8] is replying to sender.
2014-02-19 18:02:46.438: [UiServer][2762939248] Done for ctx=0xa8bb23b0
2014-02-19 18:02:46.460: [UiServer][2756705136] Closed: remote end failed/disc.
Check whether the related services are running:
[grid@myrac1 ohasd]$ ps -ef|grep cssd
grid      5402  4816  0 18:15 pts/3    00:00:00 grep cssd
[grid@myrac1 ohasd]$ ps -ef|grep has
grid      2857    1  1 17:29 ?        00:00:34 /g01/app/grid/product/11.2.0/grid/bin/ohasd.bin reboot
grid      5432  4816  0 18:16 pts/3    00:00:00 grep has
[grid@myrac1 ohasd]$ ps -ef|grep d.bin
grid      2857    1  1 17:29 ?        00:00:34 /g01/app/grid/product/11.2.0/grid/bin/ohasd.bin reboot
grid      3028    1  0 17:31 ?        00:00:01 /g01/app/grid/product/11.2.0/grid/bin/tnslsnr LISTENER -inherit
grid      3140    1  0 17:32 ?        00:00:20 /g01/app/grid/product/11.2.0/grid/bin/oraagent.bin
grid      3206    1  0 17:32 ?        00:00:04 /g01/app/grid/product/11.2.0/grid/bin/cssdagent
grid      3240    1  0 17:32 ?        00:00:03 /g01/app/grid/product/11.2.0/grid/bin/orarootagent.bin
grid      3253    1  0 17:32 ?        00:00:14 /g01/app/grid/product/11.2.0/grid/bin/diskmon.bin -d -f
grid      5461  4816  1 18:17 pts/3    00:00:00 grep d.bin

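The HAS service turned out not to have started. In principle it starts automatically at boot, because init should run the init.ohasd run command by itself: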
[grid@myrac1 ohasd]$ cat /etc/inittab |grep ohasd
h1:35:respawn:/etc/init.d/init.ohasd run >/dev/null 2>&1
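The inittab check above can be wrapped in a small helper. This is only a sketch: the function name `has_ohasd_respawn` is hypothetical, and it assumes the classic SysV `id:runlevels:action:process` inittab layout shown in the article.

```shell
# has_ohasd_respawn [FILE]
# Hypothetical helper: succeeds if FILE (default /etc/inittab) contains an
# uncommented entry whose action is "respawn" and whose process field
# mentions init.ohasd.
has_ohasd_respawn() {
    inittab_file=${1:-/etc/inittab}
    awk -F: '
        /^[[:space:]]*#/ { next }                       # skip commented-out entries
        $3 == "respawn" && $4 ~ /init\.ohasd/ { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$inittab_file"
}
```

A call such as `has_ohasd_respawn /etc/inittab && echo "ohasd respawn entry present"` confirms only that the entry exists, not that init actually managed to run it.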
An explanation from the official Oracle documentation:
With Oracle Clusterware 11g Release 2 (11.2), cluster commands have been introduced that allow stopping the cluster stack on a remote node (as opposed to stopping it locally only with the commands listed above). In order to stop the cluster stack with the exception of the Oracle High Availability Services (OHAS, daemon OHASD), use:
crsctl stop cluster -all      #stops the cluster layer on all servers in the cluster
crsctl stop cluster -n   #stops the cluster layer on the named server
In 11g, the ohasd stack encompasses crsd, ocssd, and evmd. The 11g cluster stack has two layers: a lower stack and an upper stack.
ohasd starts the cluster resources of the lower stack, and crsd starts the cluster resources of the upper stack.

Since the ohasd service did not start, start it manually:
[root@myrac1 ~]# /etc/init.d/init.ohasd run
mkfifo: cannot create fifo `/var/tmp/.oracle/npohasd': File exists
The command keeps running and never returns, much like running Tomcat in the foreground. To run it in the background, append & to the command.
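Running the script in the background and then waiting for the stack can be sketched as follows. The helper name `wait_until` is hypothetical, and the usage comments assume that `crsctl check has` prints message CRS-4638 once HAS is online; verify that on your own installation before relying on it.

```shell
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND once per second until it succeeds
# or roughly TIMEOUT_SECONDS have elapsed. Returns 0 on success, 1 on timeout.
wait_until() {
    timeout_s=$1
    shift
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout_s" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Usage sketch (as root, with the path used in the article):
#   nohup /etc/init.d/init.ohasd run >/dev/null 2>&1 &
#   has_online() { crsctl check has 2>&1 | grep -q 'CRS-4638'; }  # assumed message code
#   wait_until 300 has_online && crsctl status res -t
```

Polling a status command is used here instead of a fixed sleep because the time the stack needs to come up varies from boot to boot.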
After a while, check the resource status:
[grid@myrac1 ohasd]$ crsctl status res -t
--------------------------------------------------------------------------------
NAME          TARGET  STATE        SERVER                  STATE_DETAILS     
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA_DG.dg   ONLINE  ONLINE      myrac1                                     
ora.DG_FRA.dg    ONLINE  ONLINE      myrac1                                     
ora.LISTENER.lsnr ONLINE  ONLINE      myrac1                                     
ora.SYS_DG.dg    ONLINE  ONLINE      myrac1                                     
ora.asm          ONLINE  ONLINE      myrac1                  Started           
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd        ONLINE  ONLINE      myrac1                                     
ora.diskmon      ONLINE  ONLINE      myrac1                                     
ora.hjj.db       OFFLINE OFFLINE                              Instance Shutdown 
[grid@myrac1 ohasd]$ crs_stat -t
Name          Type          Target    State    Host       
------------------------------------------------------------
ora.DATA_DG.dg ora....up.type ONLINE    ONLINE    myrac1     
ora.DG_FRA.dg  ora....up.type ONLINE    ONLINE    myrac1     
ora....ER.lsnr ora....er.type ONLINE    ONLINE    myrac1     
ora.SYS_DG.dg  ora....up.type ONLINE    ONLINE    myrac1     
ora.asm        ora.asm.type  ONLINE    ONLINE    myrac1     
ora.cssd      ora.cssd.type  ONLINE    ONLINE    myrac1     
ora.diskmon    ora....on.type ONLINE    ONLINE    myrac1     
ora.hjj.db    ora....se.type OFFLINE  OFFLINE       
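Only ora.hjj.db is OFFLINE, because the database has not been started yet. It changes to ONLINE once the database is up.
Start the ASM instance. It is already running here, so the first startup returns ORA-01081 and the instance is restarted instead: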
[grid@myrac1 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.1.0 Production on Wed Feb 19 17:34:03 2014

Copyright (c) 1982, 2009, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Automatic Storage Management option

SQL> startup
ORA-01081: cannot start already-running ORACLE - shut it down first
SQL> shutdown immediater
SP2-0717: illegal SHUTDOWN option
SQL> shutdown immediate
ASM diskgroups dismounted
ASM instance shutdown
SQL> startup
ASM instance started

Total System Global Area  284565504 bytes
Fixed Size                  1336036 bytes
Variable Size            258063644 bytes
ASM Cache                  25165824 bytes
ASM diskgroups mounted


Start the database:
[oracle@myrac1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.1.0 Production on Wed Feb 19 18:40:55 2014

Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area  313860096 bytes
Fixed Size                  1336232 bytes
Variable Size            130026584 bytes
Database Buffers          176160768 bytes
Redo Buffers                6336512 bytes
Database mounted.
Database opened.
Check the resource status again:
[grid@myrac1 ~]$ crs_stat -t
Name          Type          Target    State    Host       
------------------------------------------------------------
ora.DATA_DG.dg ora....up.type ONLINE    ONLINE    myrac1     
ora.DG_FRA.dg  ora....up.type ONLINE    ONLINE    myrac1     
ora....ER.lsnr ora....er.type ONLINE    ONLINE    myrac1     
ora.SYS_DG.dg  ora....up.type ONLINE    ONLINE    myrac1     
ora.asm        ora.asm.type  ONLINE    ONLINE    myrac1     
ora.cssd      ora.cssd.type  ONLINE    ONLINE    myrac1     
ora.diskmon    ora....on.type ONLINE    ONLINE    myrac1     
ora.hjj.db    ora....se.type ONLINE    ONLINE    myrac1 

If a resource has not started, start it with crsctl start res resource_name.
At this point the problem is resolved.
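To find out which resources still need a manual start, the status output can be filtered. The sketch below assumes the non-tabular `crsctl status res` output (without -t), which prints `NAME=`, `TYPE=`, `TARGET=`, and `STATE=` lines per resource; the function name `offline_resources` is hypothetical. The tabular `crs_stat -t` output is not used because it abbreviates resource names.

```shell
# offline_resources
# Hypothetical helper: reads NAME=/STATE= blocks on stdin and prints the name
# of every resource whose STATE begins with OFFLINE.
offline_resources() {
    awk -F= '
        $1 == "NAME"                     { name = $2 }
        $1 == "STATE" && $2 ~ /^OFFLINE/ { print name }
    '
}

# Usage sketch:
#   crsctl status res | offline_resources
#   crsctl start res ora.hjj.db     # start one of the names printed above
```

On a multi-node cluster the STATE line can list several servers, so this simple prefix match is only meant for a single-node setup like the one in this article.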

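Summary: when starting services, keep a close eye on the background logs to see which actions were performed. That way you know exactly which step failed and can locate and fix the problem quickly.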
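When reading ohasd.log, it can help to look at one component at a time. The sketch below relies only on the line layout visible in the excerpt earlier in this article (timestamp, then a bracketed component tag such as `[UiServer]` or `[  CRSPE]`); the function name `log_lines_for` is hypothetical, and the log file path is left to the caller because it depends on the installation.

```shell
# log_lines_for COMPONENT FILE
# Hypothetical helper: prints the lines of an ohasd.log-style file whose first
# bracketed tag equals COMPONENT after padding spaces are stripped.
log_lines_for() {
    component=$1
    logfile=$2
    awk -v comp="$component" '
        {
            # The component tag is the first [...] group on the line.
            if (match($0, /\[[^]]*\]/)) {
                tag = substr($0, RSTART + 1, RLENGTH - 2)
                gsub(/^ +| +$/, "", tag)
                if (tag == comp) print
            }
        }
    ' "$logfile"
}

# Usage sketch (the path is an assumption, adjust it to your Grid home):
#   log_lines_for CRSPE /path/to/ohasd.log | tail -n 20
```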

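Appendix: common commands
Check the status of HAS
crsctl check has
Check the status of CSS
crsctl check css
Check the status of resources
crs_stat -t -v
crsctl status res -t
Start a resource
crsctl start res resource_name
OCR information
ocrcheck
Show the configuration of database hjj
srvctl config database -d hjj
Show the parameters of a resource
crs_stat -p ora.hjj.db
crsctl usage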
[grid@myrac1 ~]$ crsctl -h
Usage: crsctl add      - add a resource, type or other entity
      crsctl check    - check a service, resource or other entity
      crsctl config    - output autostart configuration
      crsctl debug    - obtain or modify debug state
      crsctl delete    - delete a resource, type or other entity
      crsctl disable  - disable autostart
      crsctl enable    - enable autostart
      crsctl get      - get an entity value
      crsctl getperm  - get entity permissions
      crsctl lsmodules - list debug modules
      crsctl modify    - modify a resource, type or other entity
      crsctl query    - query service state
      crsctl pin      - Pin the nodes in the nodelist
      crsctl relocate  - relocate a resource, server or other entity
      crsctl replace  - replaces the location of voting files
      crsctl setperm  - set entity permissions
      crsctl set      - set an entity value
      crsctl start    - start a resource, server or other entity
      crsctl status    - get status of a resource or other entity
      crsctl stop      - stop a resource, server or other entity
      crsctl unpin    - unpin the nodes in the nodelist
      crsctl unset    - unset a entity value, restoring its default

Oracle High Availability Services release version
crsctl query has releaseversion
Oracle High Availability Services software version
crsctl query has softwareversion
