Restoration Method for 10g Clusterware Votedisk Corruption

Source: Internet
Author: User

The votedisk is critical for RAC (10g Clusterware, 11g GI). It acts as an arbitration disk: when a node in the RAC cluster fails or loses its network connection, the votedisk is used to decide whether to evict that node so that the rest of the cluster keeps running normally. If the votedisk is damaged, the cluster services cannot start, cluster resources cannot be loaded, and the cluster stops working. We therefore need to keep the votedisk backed up. In 11g, the votedisk and OCR are placed in an ASM disk group by default, so they need less special attention. In 10g Clusterware, however, the votedisk cannot be placed in an ASM disk group and is typically stored on a raw device, so it needs particular care and should be backed up regularly, for example:
Use the dd command to back up and restore the votedisk:
Backup: dd if=/dev/raw/raw3 of=/tmp/votedisk.bak
Recovery: dd if=/tmp/votedisk.bak of=/dev/raw/raw3
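The backup step can be wrapped in a small script so that it runs regularly and leaves a checksum next to each copy. The following is only a minimal sketch: the function names backup_votedisk and verify_votedisk_backup, the timestamped file name, and the .cksum sidecar file are assumptions made for illustration and are not part of any Oracle tooling.

```shell
#!/bin/sh
# Sketch: dd-based votedisk backup with a checksum sidecar.
# All function and file names here are hypothetical.

# backup_votedisk SRC DEST_DIR
# Copies SRC (for example /dev/raw/raw3) into DEST_DIR and prints the backup path.
backup_votedisk() {
    src=$1
    dest_dir=$2
    dest="$dest_dir/votedisk.$(date +%Y%m%d%H%M%S).bak"

    # Copy the whole device in 4 KB blocks, as in the dd command above.
    dd if="$src" of="$dest" bs=4096 2>/dev/null || return 1

    # Refuse to keep an empty backup.
    [ -s "$dest" ] || return 1

    # Record the checksum of the backup itself for later integrity checks.
    cksum < "$dest" > "$dest.cksum" || return 1

    echo "$dest"
}

# verify_votedisk_backup BACKUP_FILE
# Succeeds only if the backup still matches its recorded checksum.
verify_votedisk_backup() {
    bak=$1
    [ -f "$bak.cksum" ] || return 1
    [ "$(cksum < "$bak")" = "$(cat "$bak.cksum")" ]
}
```

A call such as backup_votedisk /dev/raw/raw3 /tmp would produce a file like /tmp/votedisk.<timestamp>.bak, and verify_votedisk_backup can be run on that file before it is written back with dd. The checksum is taken from the backup file rather than compared against the device, because the contents of a votedisk may change while the cluster is running.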
Unfortunately, if you have neither a backup nor a mirror of the votedisk, the only option when it is damaged is to re-create CRS. The following shows the process:
-- Stop CRS and destroy the votedisk, here /dev/raw/raw3:
[root@rac1 ~]# dd if=/dev/zero of=/dev/raw/raw3 bs=4096 count=12800

If you now try to restart CRS, it fails to start. The ocssd.log file contains the following records, which indicate that the disk is damaged. PS: the 10g Clusterware logs are located under $ORA_CRS_HOME/log/<host name>/...
[CSSD] 09:37:38.327> USER: Oracle Database 10g CSS Release 10.2.0.1.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
[CSSD] 09:37:38.327> USER: CSS daemon log for node rac1, number 1, in cluster
[clsdmt] Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=rac1DBG_CSSD))
[CSSD] 09:37:38.332 [3059615952]> TRACE: clssscmain: local-only set to false
[CSSD] 2015-01-16 09:37:38.344 [3059615952]> TRACE: clssnmReadNodeInfo: added node 1 (rac1) to cluster
[CSSD] 09:37:38.352 [3059615952]> TRACE: clssnmReadNodeInfo: added node 2 (rac2) to cluster
[CSSD] 09:37:38.356 [3032808336]> TRACE: clssnm_skgxnmon: skgxn init failed, rc 1
[CSSD] 09:37:38.356 [3059615952]> TRACE: clssnm_skgxnonline: Using vacuous skgxn monitor
[CSSD] 09:37:38.362 [3059615952]> TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0//dev/raw/raw3)
[CSSD] 09:37:40.381 [3032808336]> TRACE: clssnmvDiskOpen: Corrupt kill block on disk (0x09 != 0x636c73536b696c4c)
[CSSD] 09:37:40.381 [3032808336]> TRACE: clssnmDiskStateChange: state from 2 to 3 disk (0//dev/raw/raw3)
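The telling entry in the log above is the clssnmvDiskOpen message about the kill block on the disk. A minimal sketch for pulling such entries out of a log file follows. The function name find_votedisk_errors is hypothetical, the log path is passed in by the caller, and the pattern is based only on the single message shown in this article, so other failure modes may well log something different.

```shell
#!/bin/sh
# Sketch: print ocssd.log entries that report a bad kill block on a votedisk.
# find_votedisk_errors is a hypothetical helper, not an Oracle utility.

# find_votedisk_errors LOG_FILE
# Prints matching lines; exit status is 0 if at least one line matched.
find_votedisk_errors() {
    log=$1
    # -i: match regardless of capitalisation in the message text.
    grep -i 'clssnmvDiskOpen:.*kill block' "$log"
}
```

Called with the path to ocssd.log, the function prints the matching lines, and its exit status can be used in a monitoring script. A non-zero status only means this particular message was not found, not that the votedisk is healthy.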
Re-creating CRS is straightforward. Run two scripts as root on each node:
1. $ORA_CRS_HOME/install/rootdelete.sh
2. $ORA_CRS_HOME/install/rootdeinstall.sh


Node 1:
[root@rac1 install]# ./rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
Stopping resources.
Error while stopping resources. Possible cause: CRSD is down.
Stopping CSSD.
Unable to communicate with the CSS daemon.
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
[root@rac1 install]# ./rootdeinstall.sh
Removing contents from OCR device
2560+0 records in
2560+0 records out
10485760 bytes (10 MB) copied, 0.590608 seconds, 17.8 MB/s

Node 2:
[root@rac2 install]# ./rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
OCR initialization failed with invalid format: PROC-22: The OCR backend has an invalid format
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
[root@rac2 install]# ./rootdeinstall.sh
Removing contents from OCR device
2560+0 records in
2560+0 records out
10485760 bytes (10 MB) copied, 0.627909 seconds, 16.7 MB/s
[root@rac2 install]# dd if=/dev/zero of=/dev/raw/raw3 bs=4096 count=128000
dd: writing '/dev/raw/raw3': No space left on device
25601+0 records in
25600+0 records out
104857600 bytes (105 MB) copied, 5.40456 seconds, 19.4 MB/s
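In the transcript above, dd on rac2 was asked for 128000 blocks but stopped with "No space left on device" after writing 25600 full blocks of 4096 bytes, that is 104857600 bytes, which is everything the device could hold. The relation between a size in bytes and the dd count can be sketched as follows. The helper name blocks_for_size is hypothetical.

```shell
#!/bin/sh
# Sketch: how many dd blocks of a given size are needed to cover a byte count.
# blocks_for_size is a hypothetical helper for illustration.

# blocks_for_size SIZE_BYTES BLOCK_SIZE
# Prints the block count, rounded up to cover a partial last block.
blocks_for_size() {
    size=$1
    bs=$2
    echo $(( (size + bs - 1) / bs ))
}

blocks_for_size 104857600 4096   # prints 25600, the blocks written on rac2
blocks_for_size 52428800 4096    # prints 12800, the count used on rac1
```

Passing a count larger than the device, as in the transcript, does no harm when the goal is to zero the whole disk; dd simply stops at the end of the device.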
Then run $ORA_CRS_HOME/root.sh on the two nodes, one after the other. The software itself does not need to be reinstalled with OUI.

If the scripts do not clean up completely and CRS cannot be re-created as a result, you can manually delete the following files and directories and restore the original inittab:
rm /etc/oracle/*
rm -f /etc/init.d/init.cssd
rm -f /etc/init.d/init.crs
rm -f /etc/init.d/init.crsd
rm -f /etc/init.d/init.evmd
rm -f /etc/rc2.d/K96init.crs
rm -f /etc/rc2.d/S96init.crs
rm -f /etc/rc3.d/K96init.crs
rm -f /etc/rc3.d/S96init.crs
rm -f /etc/rc5.d/K96init.crs
rm -f /etc/rc5.d/S96init.crs
rm -Rf /etc/oracle/scls_scr
rm -f /etc/inittab.crs
cp /etc/inittab.orig /etc/inittab
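Before deleting anything by hand, it can help to see which of these leftovers actually exist on a node. The following is a minimal dry-run sketch that only lists files and removes nothing. The function name list_crs_leftovers and its optional root-prefix argument are assumptions made for illustration, and the path list is taken from the commands above.

```shell
#!/bin/sh
# Sketch: list CRS leftovers that are still present, without deleting them.
# list_crs_leftovers and its ROOT argument are hypothetical.

# list_crs_leftovers [ROOT]
# ROOT is an optional prefix (empty for the live system), useful for
# checking a mounted copy of a filesystem.
list_crs_leftovers() {
    root=${1:-}
    for f in \
        /etc/init.d/init.cssd \
        /etc/init.d/init.crs \
        /etc/init.d/init.crsd \
        /etc/init.d/init.evmd \
        /etc/rc2.d/K96init.crs /etc/rc2.d/S96init.crs \
        /etc/rc3.d/K96init.crs /etc/rc3.d/S96init.crs \
        /etc/rc5.d/K96init.crs /etc/rc5.d/S96init.crs \
        /etc/oracle/scls_scr \
        /etc/inittab.crs
    do
        # Print only the paths that exist under ROOT.
        if [ -e "$root$f" ]; then
            echo "$f"
        fi
    done
}
```

Running list_crs_leftovers with no argument on each node prints the paths that remain. An empty result suggests rootdelete.sh and rootdeinstall.sh already cleaned up these entries.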
Summary:
We usually keep multiple mirrored copies of the OCR and votedisk. In addition, when they reside on raw devices, we back them up separately with the dd command. With these precautions they are rarely damaged or lost. Without any backup, the fault can only be resolved by rebuilding CRS, which is the DBA's last lifeline.
