CRS would not start: running crsctl start crs produced no response.
Check messages:
[root@UNID02 ~]# tail -f /var/log/messages
Dec 9 08:11:13 UNID02 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.7259.
Dec 9 08:12:13 UNID02 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.7468.
Dec 9 08:12:13 UNID02 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.7356.
Dec 9 08:12:13 UNID02 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.7259.
Dec 9 08:13:13 UNID02 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.7468.
Dec 9 08:13:13 UNID02 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.7356.
Dec 9 08:13:14 UNID02 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.7259.
Dec 9 08:14:13 UNID02 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.7468.
Dec 9 08:14:13 UNID02 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.7356.
Dec 9 08:14:14 UNID02 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.7259.
The reason CRS cannot start is: Cluster Ready Services waiting on dependencies
It appears that a component CRS depends on has not come up.
Look at the more detailed diagnostics:
[root@UNID02 ~]# less /tmp/crsctl.7259
Oracle Cluster Registry initialization failed accessing Oracle Cluster Registry device: PROC-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]
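Since several crsctl.* files are named in messages, it can help to collect them in one pass instead of opening them one by one. The following is a minimal sketch; the function name crs_diag_files is hypothetical, and it assumes the syslog line format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read syslog lines on stdin and print the unique
# diagnostics files named in "waiting on dependencies" messages.
crs_diag_files() {
    sed -n 's|.*waiting on dependencies\. Diagnostics in \(/tmp/crsctl\.[0-9][0-9]*\).*|\1|p' \
        | sort -u
}

# Example usage (as root), printing each diagnostics file with a header:
#   crs_diag_files < /var/log/messages | while read -r f; do
#       echo "== $f"; cat "$f"
#   done
```

Each file usually repeats the same PROC-26 message, so a difference between them is worth a closer look.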
[root@UNID02 ~]# service rawdevices status
/dev/raw/raw5: bound to major 120, minor 97
/dev/raw/raw6: bound to major 120, minor 113
/dev/raw/raw7: bound to major 120, minor 129
Only three raw devices are bound, so bind the remaining ones:
[root@UNID02 ~]# service rawdevices start
Assigning devices:
/dev/raw/raw1 --> /dev/emcpowera1
/dev/raw/raw1: bound to major 120, minor 1
/dev/raw/raw2 --> /dev/emcpowerb1
/dev/raw/raw2: bound to major 120, minor 17
/dev/raw/raw3 --> /dev/emcpowerc1
/dev/raw/raw3: bound to major 120, minor 33
/dev/raw/raw4 --> /dev/emcpowerd1
/dev/raw/raw4: bound to major 120, minor 49
/dev/raw/raw5 --> /dev/emcpowerg1
/dev/raw/raw5: bound to major 120, minor 97
/dev/raw/raw6 --> /dev/emcpowerh1
/dev/raw/raw6: bound to major 120, minor 113
/dev/raw/raw7 --> /dev/emcpoweri1
/dev/raw/raw7: bound to major 120, minor 129
done
[root@UNID02 ~]#
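To confirm that no configured binding is missing, the binding status can be compared against the configuration file. This is a sketch under assumptions: the function name missing_raw_bindings is hypothetical, the status text is assumed to look like the "bound to major ..." lines above, and the configuration is assumed to be the usual /etc/sysconfig/rawdevices file with one "rawdevice blockdevice" pair per line.

```shell
#!/bin/sh
# Hypothetical helper: missing_raw_bindings STATUS_FILE CONFIG_FILE
# Prints every raw device listed in CONFIG_FILE that has no "bound" line
# in STATUS_FILE.
missing_raw_bindings() {
    awk '
        # Status file, e.g. "/dev/raw/raw5: bound to major 120, minor 97"
        FILENAME == ARGV[1] {
            if ($2 == "bound") { sub(/:$/, "", $1); bound[$1] = 1 }
            next
        }
        # Config file: skip comments and blank lines
        /^[ \t]*(#|$)/ { next }
        !($1 in bound) { print $1 }
    ' "$1" "$2"
}

# Example usage (as root; the config path is an assumption):
#   raw -qa > /tmp/raw.status
#   missing_raw_bindings /tmp/raw.status /etc/sysconfig/rawdevices
```

Empty output means every configured raw device is bound. The script matches on FILENAME rather than NR == FNR so that an empty status file (no device bound at all) is still handled correctly.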
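Why were these raw devices not bound? The storage array had failed and restarted. After the restart the disks were attached to the host again, but they were not recognized as raw devices.
So start the rawdevices service once more.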
[root@UNID02 ~]# service rawdevices start
Assigning devices:
/dev/raw/raw1 --> /dev/emcpowera1
/dev/raw/raw1: bound to major 120, minor 1
/dev/raw/raw2 --> /dev/emcpowerb1
/dev/raw/raw2: bound to major 120, minor 17
/dev/raw/raw3 --> /dev/emcpowerc1
/dev/raw/raw3: bound to major 120, minor 33
/dev/raw/raw4 --> /dev/emcpowerd1
/dev/raw/raw4: bound to major 120, minor 49
/dev/raw/raw5 --> /dev/emcpowerg1
/dev/raw/raw5: bound to major 120, minor 97
/dev/raw/raw6 --> /dev/emcpowerh1
/dev/raw/raw6: bound to major 120, minor 113
/dev/raw/raw7 --> /dev/emcpoweri1
/dev/raw/raw7: bound to major 120, minor 129
done
[root@UNID02 ~]#
Keep watching messages; the earlier error is still being reported:
Dec 9 08:15:14 UNID02 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.7259.
CRS evidently still hits a problem during startup, so check the crsctl.* file again:
[root@UNID02 ~]# cat /tmp/crsctl.7259
Oracle Cluster Registry initialization failed accessing Oracle Cluster Registry device: PROC-26: Error while accessing the physical storage Operating System error [Permission denied] [13]
[root@UNID02 ~]#
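It is still the same PROC-26 error, but the operating system error has changed from [No such file or directory] to [Permission denied].
This points to a permission problem on the raw devices, so check their permissions: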
[root@UNID02 ~]# ll /dev/raw
total 0
crw------- 1 root root 162, 1 Dec 9 08:14 raw1
crw------- 1 root root 162, 2 Dec 9 08:14 raw2
crw------- 1 root root 162, 3 Dec 9 08:14 raw3
crw------- 1 root root 162, 4 Dec 9 08:14 raw4
crw------- 1 root root 162, 5 Dec 9 08:14 raw5
crw------- 1 root root 162, 6 Dec 9 08:14 raw6
crw------- 1 root root 162, 7 Dec 9 08:14 raw7
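As suspected, every device is owned by root and accessible only to root. Change the owner of the raw devices to oracle (group dba):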
[root@UNID02 ~]# chown oracle:dba /dev/raw/*
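The listing above shows the device nodes owned by root right after the rawdevices service was started, so ownership is worth re-checking whenever the bindings are recreated. A minimal sketch follows; the function name raw_wrong_owner is hypothetical, and it assumes the ls -l column layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: raw_wrong_owner OWNER:GROUP
# Reads "ls -l /dev/raw" output on stdin and prints every character device
# whose owner:group differs from the expected pair.
raw_wrong_owner() {
    awk -v want="$1" '
        # Character devices start with "c"; $3 is the owner, $4 the group,
        # and the last field is the device name.
        /^c/ {
            have = $3 ":" $4
            if (have != want) print $NF, have
        }
    '
}

# Example usage:
#   ls -l /dev/raw | raw_wrong_owner oracle:dba
```

Empty output means all raw devices already have the expected owner and group.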
messages now shows CRS starting:
Dec 9 08:16:14 UNID02 logger: Cluster Ready Services completed waiting on dependencies.
Dec 9 08:16:14 UNID02 last message repeated 2 times
Dec 9 08:16:14 UNID02 logger: Running CRSD with TZ =
Dec 9 08:16:14 UNID02 logger: Oracle CSS Family monitor starting.
Dec 9 08:16:15 UNID02 logger: Oracle CSS restart. 0, 1
I thought the problem ended there, but after waiting a while, ASM and the database instance still had not started:
[oracle@UNID02 ~]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE OFFLINE
ora....01.lsnr application ONLINE OFFLINE
ora....t01.gsd application ONLINE OFFLINE
ora....t01.ons application ONLINE OFFLINE
ora....t01.vip application ONLINE OFFLINE
ora....SM2.asm application ONLINE OFFLINE
ora....02.lsnr application ONLINE ONLINE UNID02
ora....t02.gsd application ONLINE ONLINE UNID02
ora....t02.ons application ONLINE ONLINE UNID02
ora....t02.vip application ONLINE ONLINE UNID02
ora....RTAL.cs application ONLINE OFFLINE
ora....ac1.srv application ONLINE OFFLINE
ora....ac2.srv application ONLINE OFFLINE
ora.rac.db application ONLINE OFFLINE
ora....c1.inst application ONLINE OFFLINE
ora....c2.inst application ONLINE OFFLINE
ora....rwss.cs application ONLINE OFFLINE
ora....ac1.srv application ONLINE OFFLINE
ora....ac2.srv application ONLINE OFFLINE
ora...._taf.cs application ONLINE OFFLINE
ora....ac1.srv application ONLINE OFFLINE
ora....ac2.srv application ONLINE OFFLINE
ora....test.cs application ONLINE OFFLINE
ora....ac1.srv application ONLINE OFFLINE
ora....rac1.cs application ONLINE OFFLINE
ora....ac1.srv application ONLINE OFFLINE
ora....ac2.srv application ONLINE OFFLINE
ora....rac2.cs application ONLINE OFFLINE
ora....ac1.srv application ONLINE OFFLINE
ora....ac2.srv application ONLINE OFFLINE
[oracle@UNID02 ~]$
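With this many resources, the ones that should be running but are not can be filtered out of the listing. A minimal sketch, assuming the five-column crs_stat -t layout shown above (Name, Type, Target, State, Host); the function name crs_offline_resources is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "crs_stat -t" output on stdin and print the
# names of resources whose Target is ONLINE but whose State is OFFLINE.
crs_offline_resources() {
    awk 'NF >= 4 && $3 == "ONLINE" && $4 == "OFFLINE" { print $1 }'
}

# Example usage (as the oracle user):
#   crs_stat -t | crs_offline_resources
```

Note that crs_stat -t abbreviates resource names, so the output identifies resources only as precisely as the listing itself does.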
So start the ASM instance manually:
[oracle@UNID02 ~]$ dba
SQL*Plus: Release 11.1.0.7.0 - Production on Mon Dec 9 08:25:06 2013
Copyright (c) 1982, 2008, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:check if cable failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpcini1
ORA-27303: additional information: requested interface eth1 interface not running set _disable_interface_checking = TRUE to disable this check for single instance cluster. Check output from ifcon
SQL>
SQL>
SQL> quit
Disconnected
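This error occurs because, at startup, the ASM and database instances in a RAC cluster use the private network to check the private network information of the other nodes, and at this point the other node was powered off. As the ORA-27303 text shows, the check failed on interface eth1, which was not running.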
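Before changing any parameters, the state of the interface named in ORA-27303 can be confirmed from the operating system. The sketch below is an assumption-laden illustration: the function name iface_operstate is hypothetical, and it assumes a Linux kernel that exposes /sys/class/net/<interface>/operstate.

```shell
#!/bin/sh
# Hypothetical helper: iface_operstate IFACE [SYSFS_NET_DIR]
# Prints the kernel's operational state for IFACE (for example "up" or
# "down"). Prints "unknown" and returns 1 if the state file cannot be read.
iface_operstate() {
    iface=$1
    sysnet=${2:-/sys/class/net}   # assumed sysfs location
    if [ -r "$sysnet/$iface/operstate" ]; then
        cat "$sysnet/$iface/operstate"
    else
        echo "unknown"
        return 1
    fi
}

# Example usage:
#   iface_operstate eth1
```

A state other than "up" on the interconnect interface is consistent with the "interface not running" text in the error.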
The workaround is to set the following two hidden parameters in the ASM parameter file:
_disable_instance_params_check = TRUE
_disable_interface_checking = TRUE
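_disable_instance_params_check = TRUE tells the instance to ignore the value of instance_type at startup. The _disable_interface_checking parameter is intended only for the database parameter file; used in an ASM instance on its own, it raises ORA-15021: parameter "_disable_interface_checking" is not valid in asm instance. Setting _disable_instance_params_check = TRUE here skips the instance_type check so that the ASM instance accepts it.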
In the database instance parameter file, set the following hidden parameter:
_disable_interface_checking = TRUE
Start ASM and the database instance again; both now start normally.