Failed to upgrade Oracle Cluster Registry configuration (root.sh)
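Recently, while installing Oracle 10g RAC on SUSE 11 SP3 for a customer, I ran /u01/app/crs/root.sh after installing Clusterware and got the error "Failed to upgrade Oracle Cluster Registry configuration". The environment uses multipath, and according to Oracle's description this is an Oracle bug (4679769). If you have hit the same problem, read on.
I. Symptoms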
suse11a:/u01/app/crs # /u01/app/crs/root.sh
WARNING: directory '/u01/app' is not owned by root
Checking to see if Oracle CRS stack is already configured
/etc/oracle does not exist. Creating it now.
Setting the permissions on OCR backup directory
Setting up NS directories
Failed to upgrade Oracle Cluster Registry configuration # this is the error message
# Running clsfmt by hand then reports "Received unexpected error". Note: /u01/app/crs is ORA_CRS_HOME.
suse11a:/ # /u01/app/crs/bin/clsfmt ocr /dev/raw/raw1
clsfmt: Received unexpected error 4 from skgfifi
skgfifi: Additional information: -2
Additional information: 1073741824
# The detailed error log follows
suse11a:/u01/app/crs/log/suse11a/client # pwd
/u01/app/crs/log/suse11a/client
suse11a:/u01/app/crs/log/suse11a/client # more ocrconfig_24066.log
Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.
2014-08-11 11:52:14.993: [ OCRCONF][2176517888]ocrconfig starts...
2014-08-11 11:52:14.994: [ OCRCONF][2176517888]Upgrading OCR data
2014-08-11 11:52:15.100: [ OCRRAW][2176517888]propriogid:1: INVALID FORMAT
2014-08-11 11:52:15.101: [ OCRRAW][2176517888]ibctx:1:ERROR: INVALID FORMAT
2014-08-11 11:52:15.101: [ OCRRAW][2176517888]proprinit:problem reading the bootblock or superbloc 22
2014-08-11 11:52:15.102: [ default][2176517888]a_init:7!: Backend init unsuccessful : [22]
2014-08-11 11:52:15.102: [ OCRCONF][2176517888]Exporting OCR data to [OCRUPGRADEFILE]
2014-08-11 11:52:15.102: [ OCRAPI][2176517888]a_init:7!: Backend init unsuccessful : [33]
2014-08-11 11:52:15.102: [ OCRCONF][2176517888]There was no previous version of OCR. error:[PROC-33: Oracle Cluster Registry is not configured]
2014-08-11 11:52:15.108: [ OCRRAW][2176517888]propriogid:1: INVALID FORMAT
2014-08-11 11:52:15.108: [ OCRRAW][2176517888]ibctx:1:ERROR: INVALID FORMAT
2014-08-11 11:52:15.108: [ OCRRAW][2176517888]proprinit:problem reading the bootblock or superbloc 22
2014-08-11 11:52:15.108: [ default][2176517888]a_init:7!: Backend init unsuccessful : [22]
2014-08-11 11:52:15.113: [ OCRRAW][2176517888]propriogid:1: INVALID FORMAT
2014-08-11 11:52:15.113: [ OCRRAW][2176517888]ibctx:1:ERROR: INVALID FORMAT
2014-08-11 11:52:15.113: [ OCRRAW][2176517888]proprinit:problem reading the bootblock or superbloc 22
2014-08-11 11:52:15.118: [ OCRRAW][2176517888]propriogid:1: INVALID FORMAT
2014-08-11 11:52:15.126: [ OCRRAW][2176517888]propriowv: Vote information on disk 0 [/dev/raw/raw1] is adjusted from [0/0] to [2/2]
2014-08-11 11:52:15.137: [ OCRRAW][2176517888]propriniconfig:No 92 configuration
2014-08-11 11:52:15.137: [ OCRAPI][2176517888]a_init:6a: Backend init successful
2014-08-11 11:52:15.165: [ OCRCONF][2176517888]Initialized DATABASE keys in OCR
2014-08-11 11:52:15.176: [ OCRCONF][2176517888]csetskgfrblock0: clsfmt returned with error [4].
2014-08-11 11:52:15.176: [ OCRCONF][2176517888]Failure in setting block0 [-1]
2014-08-11 11:52:15.176: [ OCRCONF][2176517888]OCR block 0 is not set !
2014-08-11 11:52:15.176: [ OCRCONF][2176517888]Exiting [status=failed]...
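Most of the log above is noise; the lines that matter are the last four, where clsfmt returns error 4 and block 0 cannot be set. A minimal sketch for pulling those lines out of an ocrconfig log follows. The function names are my own (hypothetical), and the match strings are simply the messages seen in the log above, not an official list of error signatures.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of an ocrconfig log that point at
# the clsfmt / block 0 failure shown in the log above.
# $1: path to an ocrconfig_<pid>.log file
ocr_log_failures() {
    grep -E 'clsfmt returned with error|Failure in setting block0|OCR block 0 is not set|status=failed' "$1"
}

# Hypothetical helper: exit status 0 when the log contains the
# "clsfmt returned with error" message, non-zero otherwise.
ocr_log_has_clsfmt_error() {
    grep -q 'clsfmt returned with error' "$1"
}
```

For the log in this post, that would be `ocr_log_failures /u01/app/crs/log/suse11a/client/ocrconfig_24066.log`.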
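II. Resolution
# This failure is a bug triggered when multipath is in use, so I resolved it by following Doc ID 466673.1 directly
# The steps below were performed after downloading patch 4679769
suse11a:/robin # unzip p4679769_10201_Linux-x86-64.zip # unpack the patch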
Archive: p4679769_10201_Linux-x86-64.zip
creating: 4679769/
inflating: 4679769/README.txt
inflating: 4679769/clsfmt.bin
suse11a:/robin # cp /u01/app/crs/bin/clsfmt.bin /u01/app/crs/bin/clsfmt.bin.bak
suse11a:/robin # cp ./4679769/clsfmt.bin /u01/app/crs/bin/clsfmt.bin # overwrite the original file (note: this is only needed on the installation node)
suse11a:/robin # chmod 755 /u01/app/crs/bin/clsfmt.bin # set permissions
suse11a:/robin # /u01/app/crs/bin/clsfmt.bin ocr /dev/raw/raw1 # verify with clsfmt.bin: success
# Author : Leshami
# Blog : http://blog.csdn.net/leshami
# Next, use dd to wipe the OCR and voting disk devices (the two raw devices are currently 1 GB each)
# Note: the dd step is mandatory; without it root.sh still fails
suse11a:~ # dd if=/dev/zero of=/dev/raw/raw1 bs=1024k count=800
800+0 records in
800+0 records out
838860800 bytes (839 MB) copied, 2.64104 s, 318 MB/s
suse11a:~ # dd if=/dev/zero of=/dev/raw/raw2 bs=1024k count=800
800+0 records in
800+0 records out
838860800 bytes (839 MB) copied, 3.21852 s, 261 MB/s
# Verify again with clsfmt.bin: success
suse11a:/robin # /u01/app/crs/bin/clsfmt.bin ocr /dev/raw/raw1
clsfmt: successfully initialized file /dev/raw/raw1
suse11a:/robin # /u01/app/crs/bin/clsfmt.bin ocr /dev/raw/raw2
clsfmt: successfully initialized file /dev/raw/raw2
# Run root.sh again: success
suse11a:/robin # /u01/app/crs/root.sh
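The dd step in this section is destructive, so it may be worth wrapping it. The sketch below assumes the same approach as the transcript (zeroing the head of the device in 1 MiB blocks); the function names `wipe_bytes` and `zero_head` are hypothetical, and the existence check is my own addition so that a mistyped path does not silently create a regular file.

```shell
#!/bin/sh
# Bytes written by dd when bs is 1 MiB and count is $1.
wipe_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# zero_head TARGET MIB: overwrite the first MIB mebibytes of TARGET with zeros.
# DESTRUCTIVE: only point this at a device you intend to wipe.
zero_head() {
    target=$1
    mib=$2
    # Refuse to run if the target does not exist (assumption: a missing
    # path is more likely a typo than something dd should create).
    [ -e "$target" ] || { echo "zero_head: $target does not exist" >&2; return 1; }
    # conv=notrunc keeps dd from truncating the target before writing.
    dd if=/dev/zero of="$target" bs=1048576 count="$mib" conv=notrunc
}
```

`wipe_bytes 800` gives 838860800, which matches the byte count dd reported above, and `zero_head /dev/raw/raw1 800` would correspond to the first dd command in the transcript.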
III. Doc ID 466673.1
APPLIES TO:
Oracle Database - Enterprise Edition - Version 10.2.0.1 and later
Linux x86
IBM: Linux on POWER Systems
Linux x86-64
Linux Itanium
***Checked for relevance on 11-Mar-2013***
SYMPTOMS
On a new clusterware installation on Linux root.sh script is failing with the following error while running root.sh on the first node:
PROT-1: Failed to initialize ocrconfig
Failed to upgrade Oracle Cluster Registry configuration
The problem can be tracked down to clsfmt command:
./clsfmt ocr /dev/raw/raw1
clsfmt: Received unexpected error 4 from skgfifi
skgfifi: Additional information: -2
Additional information: 1000718336
CHANGES
It has been found that the following changes can cause this problem to occur:
1. Use of a Multiple Path (MP) disk configuration may hit this issue.
2. Use EMC device (powerpath**) may hit this issue.
However, it has not been confirmed that these are the only things that can cause the problem, as it has also been found on other hardware and configurations. The key factor is that if the disk size presented from the storage to the cluster nodes is not divisible by 4K, the problem should occur.
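The 4K condition described in the note can be checked with simple arithmetic. This is a minimal sketch, assuming Linux `blockdev` from util-linux is available and that you run it as root; the function names are my own, and I have not verified whether `blockdev` can be pointed at a /dev/raw binding directly, so the underlying block device is the safer argument.

```shell
#!/bin/sh
# Exit status 0 when the byte count in $1 is a multiple of 4096.
is_4k_multiple() {
    [ $(( $1 % 4096 )) -eq 0 ]
}

# Report whether the size of block device $1 is a multiple of 4096.
# Assumption: blockdev --getsize64 prints the device size in bytes.
check_device_4k() {
    size=$(blockdev --getsize64 "$1") || return 2
    if is_4k_multiple "$size"; then
        echo "$1: $size bytes, multiple of 4096"
    else
        echo "$1: $size bytes, NOT a multiple of 4096"
    fi
}
```

Usage would be `check_device_4k <block-device>`, where the device path is whatever your storage presents for the OCR and voting disks.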
CAUSE
This issue is addressed in Bug:4679769 which states that this is a known issue with the clusterware installation on platforms: Linux x86, x86-64 and "IBM Power Based Linux".
SOLUTION
Before running the root.sh on the first node in the cluster do the following:
1. Download Patch:4679769 from Metalink (contains a patched version of clsfmt.bin).
2. Do the following steps as stated in the patch README to fix the problem:
Note: clsfmt.bin need only be replaced on the 1st node of the cluster
# Patch Installation Instructions:
# --------------------------------
# To apply the patch, unzip the PSE container file:
#
# p4679769_10201_LINUX.zip
#
# Set your current directory to the directory where the patch
# is located:
#
# % cd 4679769
#
# Copy the clsfmt.bin binary to the $ORACLE_HOME/bin directory where
# clsfmt is being run:
#
# % cp $ORACLE_HOME/bin/clsfmt.bin $ORACLE_HOME/bin/clsfmt.bin.bak
# % cp clsfmt.bin $ORACLE_HOME/bin/clsfmt.bin
#
# Ensure permissions on the clsfmt.bin binary are correct:
#
# % chmod 755 $ORACLE_HOME/bin/clsfmt.bin
3. Run the root.sh script and proceed with the installation.
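The README steps above (back up, copy, chmod) can be sketched as one function. This is only an illustration under my own assumptions: the name `replace_clsfmt` is hypothetical, the directories are passed in as arguments rather than read from $ORACLE_HOME, and keeping an existing .bak file untouched on a second run is my choice, not something the README specifies.

```shell
#!/bin/sh
# replace_clsfmt PATCH_DIR BIN_DIR
# Back up BIN_DIR/clsfmt.bin, copy in the patched binary from PATCH_DIR,
# and set its mode to 755, mirroring the README steps.
replace_clsfmt() {
    patch_dir=$1
    bin_dir=$2
    [ -f "$patch_dir/clsfmt.bin" ] || { echo "no clsfmt.bin in $patch_dir" >&2; return 1; }
    [ -f "$bin_dir/clsfmt.bin" ] || { echo "no clsfmt.bin in $bin_dir" >&2; return 1; }
    # Keep the first backup so that re-running does not overwrite the
    # original binary with an already patched one.
    if [ ! -e "$bin_dir/clsfmt.bin.bak" ]; then
        cp -p "$bin_dir/clsfmt.bin" "$bin_dir/clsfmt.bin.bak" || return 1
    fi
    cp "$patch_dir/clsfmt.bin" "$bin_dir/clsfmt.bin" || return 1
    chmod 755 "$bin_dir/clsfmt.bin"
}
```

In the environment from this post the call would be `replace_clsfmt /robin/4679769 /u01/app/crs/bin`, run on the first node only.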
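Experts, please help: when installing Oracle 11g RAC, running the root.sh script on both nodes at the end reports an error. I am looking for a solution and will award a high score if it is solved. Many thanks.
The cluster was not installed successfully!
The ora.crsd component did not start!
Pay attention to your partitions during installation.
After a reinstall, the voting partition, data partition, and backup partition must all be reformatted!
Otherwise the installation fails and the grid components run into all sorts of problems.
Your cluster installation has clearly failed.
Run this command:
tail -500 /u01/app/11.2.0/grid/cfgtoollogs/crsconfig/rootcrs_rac1.log
and post the output.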
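What is the Oracle Cluster Registry?
Oracle Cluster Registry (OCR): maintains the configuration information for the cluster and for every cluster database in it. The OCR also manages information about the processes that Oracle Clusterware controls. It stores configuration information as a series of key-value pairs in a tree-structured directory. The OCR must reside on shared disk that all nodes in the cluster can access concurrently. Oracle Clusterware can multiplex the OCR (keep multiple copies), and Oracle recommends using this feature to ensure high availability. You can replace a failed OCR online, and you can update the OCR through supported APIs such as Enterprise Manager, srvctl, and dbca.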