Testing the importance of the OLR in an Oracle 11g cluster
Each node in a cluster has a local registry for node-specific resources, called the Oracle Local Registry (OLR).
Test 1: simulate the OLR going missing:
First, rename the OLR file:
[root@vmrac2 cdata]# mv vmrac2.olr vmrac2.olr.bak
Then try to start CRS:
[root@vmrac2 cdata]# crsctl start crs
CRS-4124: Oracle High Availability Services startup failed.
CRS-4000: Command Start failed, or completed with errors.
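A failure like this can be narrowed down before reading any logs by checking whether the OLR file is where Clusterware expects it. The following is a minimal sketch, assuming the Linux pointer file /etc/oracle/olr.loc with an olrconfig_loc=<path> line; the function names olr_path and olr_status are hypothetical helpers, not Oracle tools.

```shell
#!/bin/sh
# Sketch only: report where the OLR lives and whether the file looks usable.
# Assumption: the pointer file is /etc/oracle/olr.loc and contains a line
# of the form "olrconfig_loc=<path>". Pass another pointer file as $1.

olr_path() {
    loc_file="${1:-/etc/oracle/olr.loc}"
    # Print the value of olrconfig_loc, or nothing if the key or file is absent.
    sed -n 's/^olrconfig_loc=//p' "$loc_file" 2>/dev/null | head -n 1
}

olr_status() {
    olr="$(olr_path "$1")"
    if [ -z "$olr" ]; then
        echo "unknown"      # pointer file missing or has no olrconfig_loc line
    elif [ ! -e "$olr" ]; then
        echo "missing"      # file renamed or deleted
    elif [ ! -s "$olr" ]; then
        echo "empty"        # zero-length file
    else
        echo "present"      # exists and is non-empty; content is NOT validated
    fi
}
```

Running olr_status as root on the affected node would be expected to print "missing" after the rename above. A "present" result only means the file exists; Oracle's own integrity check is `ocrcheck -local`.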
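Next, look at the output in the cluster alert log. The CRS-0704 entries are the failed start attempts. The entries from 16:56 onward come from a later, successful start, presumably after the OLR file had been restored: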
[grid@vmrac2 vmrac2]$ tailf alertvmrac2.log
[ohasd(2495)]CRS-0704:Oracle High Availability Service aborted due to Oracle Local Registry error [PROCL-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]]. Details at (:OHAS00106:) in /u02/app/11.2.0.3/grid/log/vmrac2/ohasd/ohasd.log.
2014-06-16 16:51:59.491
[ohasd(2506)]CRS-0704:Oracle High Availability Service aborted due to Oracle Local Registry error [PROCL-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]]. Details at (:OHAS00106:) in /u02/app/11.2.0.3/grid/log/vmrac2/ohasd/ohasd.log.
2014-06-16 16:51:59.698
[ohasd(2517)]CRS-0704:Oracle High Availability Service aborted due to Oracle Local Registry error [PROCL-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]]. Details at (:OHAS00106:) in /u02/app/11.2.0.3/grid/log/vmrac2/ohasd/ohasd.log.
2014-06-16 16:51:59.901
[ohasd(2528)]CRS-0704:Oracle High Availability Service aborted due to Oracle Local Registry error [PROCL-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]]. Details at (:OHAS00106:) in /u02/app/11.2.0.3/grid/log/vmrac2/ohasd/ohasd.log.
2014-06-16 16:52:00.113
[ohasd(2539)]CRS-0704:Oracle High Availability Service aborted due to Oracle Local Registry error [PROCL-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]]. Details at (:OHAS00106:) in /u02/app/11.2.0.3/grid/log/vmrac2/ohasd/ohasd.log.
[client(2554)]CRS-10001:CRS-10132: No msg for has:crs-10132 [10][60]
2014-06-16 16:56:00.720
[ohasd(2717)]CRS-2112:The OLR service started on node vmrac2.
2014-06-16 16:56:00.788
[ohasd(2717)]CRS-1301:Oracle High Availability Service started on node vmrac2.
2014-06-16 16:56:00.855
[ohasd(2717)]CRS-8017:location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
2014-06-16 16:56:01.836
[/u02/app/11.2.0.3/grid/bin/orarootagent.bin(2768)]CRS-5016:Process "/u02/app/11.2.0.3/grid/bin/acfsload" spawned by agent "/u02/app/11.2.0.3/grid/bin/orarootagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u02/app/11.2.0.3/grid/log/vmrac2/agent/ohasd/orarootagent_root/orarootagent_root.log"
2014-06-16 16:56:19.876
[ohasd(2717)]CRS-2302:Cannot get GPnP profile.Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2014-06-16 16:56:19.909
[gpnpd(2873)]CRS-2328:GPNPD started on node vmrac2.
2014-06-16 16:56:22.751
[cssd(2947)]CRS-1713:CSSD daemon is started in clustered mode
2014-06-16 16:56:24.073
[ohasd(2717)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
2014-06-16 16:56:32.512
[cssd(2947)]CRS-1707:Lease acquisition for node vmrac2 number 2 completed
2014-06-16 16:56:33.798
[cssd(2947)]CRS-1605:CSSD voting file is online: ORCL:CRSVOL1; details in /u02/app/11.2.0.3/grid/log/vmrac2/cssd/ocssd.log.
2014-06-16 16:56:40.342
[cssd(2947)]CRS-1601:CSSD Reconfiguration complete. Active nodes are vmrac1 vmrac2 .
2014-06-16 16:56:42.635
[ctssd(3009)]CRS-2401:The Cluster Time Synchronization Service started on host vmrac2.
2014-06-16 16:56:42.635
[ctssd(3009)]CRS-2407:The new Cluster Time Synchronization Service reference node is host vmrac1.
2014-06-16 16:56:46.726
[ctssd(3009)]CRS-2408:The clock on host vmrac2 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.
[client(3047)]CRS-10001:16-Jun-14 16:56 ACFS-9391: Checking for existing ADVM/ACFS installation.
[client(3060)]CRS-10001:16-Jun-14 16:56 ACFS-9392: Validating ADVM/ACFS installation files for operating system.
[client(3062)]CRS-10001:16-Jun-14 16:56 ACFS-9393: Verifying ASM Administrator setup.
[client(3065)]CRS-10001:16-Jun-14 16:56 ACFS-9308: Loading installed ADVM/ACFS drivers.
[client(3069)]CRS-10001:16-Jun-14 16:56 ACFS-9154: Loading 'oracleoks.ko' driver.
[client(3080)]CRS-10001:16-Jun-14 16:56 ACFS-9154: Loading 'oracleadvm.ko' driver.
[client(3096)]CRS-10001:16-Jun-14 16:56 ACFS-9154: Loading 'oracleacfs.ko' driver.
[client(3180)]CRS-10001:16-Jun-14 16:56 ACFS-9327: Verifying ADVM/ACFS devices.
[client(3183)]CRS-10001:16-Jun-14 16:56 ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
[client(3187)]CRS-10001:16-Jun-14 16:56 ACFS-9156: Detecting control device '/dev/ofsctl'.
[client(3193)]CRS-10001:16-Jun-14 16:56 ACFS-9322: completed
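The failed attempts in the log above follow a pattern: a new ohasd process roughly every 200 ms, each aborting with CRS-0704. That pattern can be pulled out of a long alert log with standard text tools. This is a rough sketch that assumes the entry layout shown above (an entry starting with "[ohasd(PID)]CRS-0704:"); count_olr_aborts and olr_abort_pids are hypothetical helper names.

```shell
#!/bin/sh
# Sketch only: summarise ohasd aborts caused by OLR errors in an alert log.
# Assumption: each abort entry starts with "[ohasd(<pid>)]CRS-0704:".

# Print how many CRS-0704 entries the given log file contains.
count_olr_aborts() {
    grep -c 'CRS-0704' "$1"
}

# Print the PID of every ohasd process that aborted with CRS-0704, one per line.
olr_abort_pids() {
    sed -n 's/^\[ohasd(\([0-9]*\))]CRS-0704:.*/\1/p' "$1"
}
```

For example, `count_olr_aborts alertvmrac2.log` run from the directory used above prints the number of failed start attempts recorded in that file.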
Test 2: wipe the OLR contents by replacing the file with an empty one:
The alert log then shows:
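The commands used for this step are not recorded here. One way to produce an empty OLR file while keeping a copy of the original is sketched below, under the assumption that it is run as root from the cdata directory with Clusterware stopped on the node; empty_with_backup is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: keep a copy of a file, then truncate the original to zero bytes.
# Assumption: the caller passes the OLR path; nothing is hard-coded here.
empty_with_backup() {
    target="$1"
    # Refuse to continue if the target is not a regular file.
    [ -f "$target" ] || { echo "no such file: $target" >&2; return 1; }
    # Preserve ownership, mode and timestamps in the copy (overwrites an old .bak).
    cp -p "$target" "$target.bak" || return 1
    # Truncate in place so the path, owner and permissions stay the same.
    : > "$target"
}
```

It would be called as `empty_with_backup vmrac2.olr`. Truncating in place rather than creating a new file keeps the file's ownership and permissions unchanged, so the only difference ohasd sees is the missing content.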
[ohasd(5451)]CRS-0704:Oracle High Availability Service aborted due to Oracle Local Registry error [PROCL-26: Error while accessing the physical storage]. Details at (:OHAS00106:) in /u02/app/11.2.0.3/grid/log/vmrac2/ohasd/ohasd.log.
2014-06-16 17:19:02.723
[ohasd(5462)]CRS-0704:Oracle High Availability Service aborted due to Oracle Local Registry error [PROCL-26: Error while accessing the physical storage]. Details at (:OHAS00106:) in /u02/app/11.2.0.3/grid/log/vmrac2/ohasd/ohasd.log.
[client(5477)]CRS-10001:CRS-10132: No msg for has:crs-10132 [10][60]
The corresponding ohasd.log shows:
[grid@vmrac2 vmrac2]$ tail -300 /u02/app/11.2.0.3/grid/log/vmrac2/ohasd/ohasd.log
2014-06-16 17:19:02.722: [ OCROSD][1923920288]utread:3: Problem reading buffer 150c4000 buflen 4096 retval 0 phy_offset 102400 retry 5
2014-06-16 17:19:02.722: [ OCRRAW][1923920288]propriogid:1_1: Failed to read the whole bootblock. Assumes invalid format.
2014-06-16 17:19:02.722: [ OCRRAW][1923920288]proprioini: all disks are not OCR/OLR formatted
2014-06-16 17:19:02.722: [ OCRRAW][1923920288]proprinit: Could not open raw device
2014-06-16 17:19:02.722: [ OCRAPI][1923920288]a_init:16!: Backend init unsuccessful : [26]
2014-06-16 17:19:02.723: [ CRSOCR][1923920288] OCR context init failure. Error: PROCL-26: Error while accessing the physical storage
2014-06-16 17:19:02.723: [ default][1923920288] Created alert : (:OHAS00106:) : OLR initialization failed, error: PROCL-26: Error while accessing the physical storage
2014-06-16 17:19:02.723: [ default][1923920288][PANIC] OHASD exiting; Could not init OLR
2014-06-16 17:19:02.723: [ default][1923920288] Done
Summary:
These tests show that ohasd (Oracle High Availability Services) depends on the configuration stored in the OLR (Oracle Local Registry). If the OLR is corrupted or missing, the ohasd process fails to start.
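Since ohasd cannot start without a valid OLR, it helps to have a backup and a restore sequence ready. The sketch below only prints the commands instead of running them, so it can be reviewed first. It assumes a backup was taken earlier as root with `ocrconfig -local -manualbackup`; the function name olr_restore_plan and the backup path passed to it are hypothetical.

```shell
#!/bin/sh
# Sketch only: print (do not execute) a restore sequence for the OLR.
# Assumption: $1 is the path of an OLR backup file taken earlier with
# "ocrconfig -local -manualbackup". The commands are meant to be run as
# root, except cluvfy, which is run as the Grid Infrastructure owner.
olr_restore_plan() {
    backup="$1"
    [ -n "$backup" ] || { echo "usage: olr_restore_plan <backup-file>" >&2; return 1; }
    cat <<EOF
crsctl stop crs -f
ocrconfig -local -restore $backup
ocrcheck -local
crsctl start crs
cluvfy comp olr
EOF
}
```

Printing the plan keeps the sketch safe to run on any machine. Whether `ocrconfig -local -restore` succeeds when the OLR file is entirely absent, as in Test 1, is not covered by the tests in this post and should be verified before relying on it.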