How to Restore ASM-Based OCR After Complete Loss of the CRS DiskGroup on Linux/Unix Systems (Doc ID 1062983.1)
Applies To:
Oracle Database - Enterprise Edition - Version 11.2.0.1.0 to 11.2.0.4 [Release 11.2]
Information in this document applies to any platform.
Goal
It is not possible to directly restore a manual or automatic OCR backup if the OCR is located in an ASM disk group. This is because the command 'ocrconfig -restore' requires ASM to be up and running in order to restore an OCR backup to an ASM disk group. However, for ASM to be available, the CRS stack must have been started successfully. For the restore to succeed, the OCR must also not be in use (r/w), i.e. no CRS daemon may be running while the OCR is being restored.
A description of the procedure to restore the OCR can be found in the documentation. This document explains how to recover from a complete loss of the ASM disk group that held the OCR and voting files in an 11gR2 Grid Infrastructure environment.
Solution
When using an ASM disk group for CRS, there are typically three different types of files located in the disk group that potentially need to be restored/recreated: the Oracle Cluster Registry (OCR), the voting files, and the shared SPFILE for the ASM instance.
The following example assumes that the OCR is located in a single disk group used exclusively for CRS. The disk group has just one disk using external redundancy.
Since the CRS disk group has been lost, the CRS stack will not be available on any node.
The following settings used in the example need to be replaced according to the actual configuration:
GRID user:                       oragrid
GRID home:                       /u01/app/11.2.0/grid ($CRS_HOME)
ASM disk group name for OCR:     CRS
ASM/ASMLib disk name:            ASMD40
Linux device name for ASM disk:  /dev/sdh1
Cluster name:                    rac_cluster1
Nodes:                           racnode1, racnode2
This document assumes the name of the OCR disk group remains unchanged; however, there could be a need to use a different disk group name, in which case the name of the OCR disk group would have to be modified in /etc/oracle/ocr.loc across all nodes prior to executing the following steps.
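For reference, on Linux the OCR location is registered in /etc/oracle/ocr.loc. A minimal sketch of its typical content when the OCR resides in the +CRS disk group (the exact content may differ on your system):
# cat /etc/oracle/ocr.loc
ocrconfig_loc=+CRS
local_only=FALSE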
1. Locate the latest automatic OCR backup
When using a non-shared CRS home, automatic OCR backups can be located on any node of the cluster; consequently, all nodes need to be checked for the most recent backup:
$ ls -lrt $CRS_HOME/cdata/rac_cluster1/
-rw------- 1 root root 7331840 Mar 18:52 week.ocr
-rw------- 1 root root 7651328 Mar 01:33 week_.ocr
-rw------- 1 root root 7651328 Mar 01:33 day.ocr
-rw------- 1 root root 7651328 Mar 01:33 day_.ocr
-rw------- 1 root root 7651328 Mar 01:33 backup02.ocr
-rw------- 1 root root 7651328 Mar 05:33 backup01.ocr
-rw------- 1 root root 7651328 Mar 09:33 backup00.ocr
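Since the backup directory has to be inspected on every node, a small sketch like the following can save some typing (it assumes passwordless ssh as root between the nodes and that $CRS_HOME is set in the local environment; otherwise simply run the ls command locally on each node):
# for node in racnode1 racnode2; do echo "== $node =="; ssh $node ls -lrt $CRS_HOME/cdata/rac_cluster1/; done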
2. Make sure the Grid Infrastructure stack is shut down on all nodes
Given that the OCR disk group is missing, the GI stack will not be functional on any node; however, there could still be various daemon processes running. On each node, shut down the GI stack using the force (-f) option:
# $CRS_HOME/bin/crsctl stop crs -f
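To confirm that the forced shutdown really left no clusterware daemons behind, a quick check along these lines may help (the process names are the usual 11.2 ones):
# ps -ef | grep -E 'ohasd|crsd|ocssd|evmd' | grep -v grep
No output means no daemons are left running.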
3. Start the CRS stack in exclusive mode
On the node with the most recent OCR backup, log on as root and start CRS in exclusive mode. This mode will allow ASM to start and stay up without the presence of a voting disk and without the CRS daemon process (crsd.bin) running.
11.2.0.1:
# $CRS_HOME/bin/crsctl start crs -excl
...
CRS-2672: Attempting to start 'ora.asm' on 'racnode1'
CRS-2676: Start of 'ora.asm' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'racnode1'
CRS-2676: Start of 'ora.crsd' on 'racnode1' succeeded
Please note:
This document assumes the CRS disk group is completely lost, in which case the CRS daemon (resource ora.crsd) will terminate again due to the inaccessibility of the OCR, even if the above message indicates that the start succeeded.
If this is not the case, i.e. if the CRS disk group is still present (but corrupt or incorrect), the CRS daemon needs to be shut down manually using:
# $CRS_HOME/bin/crsctl stop res ora.crsd -init
Otherwise the subsequent OCR restore will fail.
11.2.0.2 and above:
# $CRS_HOME/bin/crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
...
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'racnode1'
CRS-2672: Attempting to start 'ora.ctssd' on 'racnode1'
CRS-2676: Start of 'ora.drivers.acfs' on 'racnode1' succeeded
CRS-2676: Start of 'ora.ctssd' on 'racnode1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'racnode1'
CRS-2676: Start of 'ora.asm' on 'racnode1' succeeded
IMPORTANT:
A new option '-nocrs' has been introduced with 11.2.0.2, which prevents the start of the ora.crsd resource. It is vital that this option is specified; otherwise the failure to start the ora.crsd resource would tear down ora.cluster_interconnect.haip, which in turn would cause ASM to crash.
4. Label the CRS disk for ASMLib use
If using ASMLib, the disk to be used for the CRS disk group needs to be stamped first. As the root user do:
# /usr/sbin/oracleasm createdisk ASMD40 /dev/sdh1
Writing disk header: done
Instantiating disk: done
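If desired, the stamping can be double-checked with oracleasm querydisk; output similar to the following would be expected (the exact wording varies between oracleasm versions):
# /usr/sbin/oracleasm querydisk ASMD40
Disk "ASMD40" is a valid ASM disk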
5. Create the CRS disk group via SQL*Plus
The disk group can now be (re-)created via SQL*Plus as the grid user. The compatible.asm attribute must be set to 11.2 in order for the disk group to be used by CRS:
$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.1.0 Production on Tue Mar 30 11:47:24 2010
Copyright (c) 1982, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Real Application Clusters and Automatic Storage Management options
SQL> create diskgroup CRS external redundancy disk 'ORCL:ASMD40' attribute 'COMPATIBLE.ASM'='11.2';
Diskgroup created.
SQL> exit
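Before continuing with the restore, it may be worth confirming that the new disk group is indeed mounted, e.g. with asmcmd as the grid user; the State column of the output should show MOUNTED:
$ asmcmd lsdg CRS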
6. Restore the latest OCR backup
Now that the CRS disk group is created and mounted, the OCR can be restored. This must be done as the root user:
# cd $CRS_HOME/cdata/rac_cluster1/
# $CRS_HOME/bin/ocrconfig -restore backup00.ocr
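As a sanity check, the integrity of the restored OCR can be verified right away (still as the root user); the output should report no errors:
# $CRS_HOME/bin/ocrcheck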
7. Start the CRS daemon on the current node (11.2.0.1 only!)
Now that the OCR has been restored, the CRS daemon can be started. This is needed to recreate the voting file. Skip this step on 11.2.0.2 and above.
# $CRS_HOME/bin/crsctl start res ora.crsd -init
CRS-2672: Attempting to start 'ora.crsd' on 'racnode1'
CRS-2676: Start of 'ora.crsd' on 'racnode1' succeeded
8. Recreate the voting file
The voting file needs to be initialized in the CRS disk group:
# $CRS_HOME/bin/crsctl replace votedisk +CRS
Successful addition of voting disk 00caa5b9c0f54f3abf5bd2a2609f09a9.
Successfully replaced voting disk group with +CRS.
CRS-4266: Voting file(s) successfully replaced
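The result can be cross-checked by listing the voting files known to CSS; the newly created voting file in +CRS should show up as ONLINE:
# $CRS_HOME/bin/crsctl query css votedisk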
9. Recreate the SPFILE for ASM (optional)
Please note:
Starting with 11gR2, ASM can start without a PFILE or SPFILE, so if you are
- not using an SPFILE for ASM
- not using a shared SPFILE for ASM
- using a shared SPFILE not stored in ASM (e.g. on a cluster file system)
this step should possibly be skipped.
Also use extra care in regard to the asm_diskstring parameter, as it impacts the discovery of the voting disks.
Please verify the previous settings using the ASM alert log.
Prepare a PFILE (e.g. /tmp/asm_pfile.ora) with the ASM startup parameters; these could vary from the example below. If in doubt, consult the ASM alert log, as the ASM instance startup should list all non-default parameter values. Please note that the last startup of ASM (in step 3 via the CRS start) will not have used an SPFILE, so a startup from before the loss of the CRS disk group would need to be located.
*.asm_power_limit=1
*.diagnostic_dest='/u01/app/oragrid'
*.instance_type='asm'
*.large_pool_size=12M
*.remote_login_passwordfile='EXCLUSIVE'
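If a non-default discovery string was in use before the crash, it must be included in the PFILE as well; for ASMLib-managed disks such as in this example that would typically be the following (verify the actual value against the ASM alert log):
*.asm_diskstring='ORCL:*'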
Now the SPFILE can be created using this PFILE:
$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.1.0 Production on Tue Mar 30 11:52:39 2010
Copyright (c) 1982, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Real Application Clusters and Automatic Storage Management options
SQL> create spfile='+CRS' from pfile='/tmp/asm_pfile.ora';
File created.
SQL> exit
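Whether the new SPFILE has been registered for the next ASM startup can be checked with asmcmd's spget command as the grid user; it should print a path inside the +CRS disk group:
$ asmcmd spget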
10. Shut down CRS
Since CRS is running in exclusive mode, it needs to be shut down to allow CRS to run on all nodes again. Use of the force (-f) option may be required:
# $CRS_HOME/bin/crsctl stop crs -f
...
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'racnode1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
11. Rescan ASM disks
If using ASMLib, rescan all ASM disks on each node as the root user:
# /usr/sbin/oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...
Instantiating disk "ASMD40"
12. Start CRS
As the root user, start CRS on all cluster nodes:
# $CRS_HOME/bin/crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
13. Verify CRS
To verify that CRS is fully functional again:
# $CRS_HOME/bin/crsctl check cluster -all
**************************************************************
racnode1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
racnode2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
# $CRS_HOME/bin/crsctl status resource -t