Changes to the CRSD process in 11g
In 11.2, the CRSD process is no longer one of the most critical processes in RAC.
If you are familiar with 10g RAC, you know how important the CRSD process is: after the operating system boots, Oracle starts CRSD, which in turn brings up the entire cluster and the databases.
In 11.2 RAC, Oracle changed ASM so that the OCR and voting disks can be stored in ASM disk groups. But ASM is itself a component managed by the cluster, while the OCR and voting disks needed for cluster startup now live in ASM, which creates a chicken-and-egg problem. Oracle resolves it with the OHASD process. As a result, the architecture of the cluster and ASM has changed significantly, and OHASD has replaced CRSD as the most critical process in a RAC environment.
CRSD has lost so much of its importance that, in a customer's 11.2 RAC environment two days ago, it turned out that even when a node's CRSD process fails to start, the database can still be started manually and accessed normally.
The cause of the problem was an error accessing the disk group that holds the OCR and voting disks on Node 2. CRSD failed while trying to read the information stored in the OCR and exited automatically, so Node 2 did not start properly. However, apart from the CRSD process, all other cluster processes on Node 2 started, the ASM instance could also start, and the database on Node 2 could be started manually.
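A quick way to confirm this state (CRSD down while the other daemons are up) is to look at the output of `crsctl check crs`. The following is only a minimal sketch: the helper names `parse_crs_check` and `run_crs_check` are hypothetical, and the sample output is illustrative of what such a node typically reports, not captured from the customer system described here.

```python
import subprocess


def parse_crs_check(output):
    """Map each CRS-nnnn message to True if it reports 'is online', else False."""
    status = {}
    for line in output.splitlines():
        line = line.strip()
        # Only consider message lines of the form "CRS-nnnn: text"
        if not line.startswith("CRS-") or ":" not in line:
            continue
        _, message = line.split(":", 1)
        message = message.strip()
        status[message] = message.endswith("is online")
    return status


def run_crs_check(crsctl="crsctl"):
    """Run `crsctl check crs` and parse its output.

    Assumes crsctl from the Grid Infrastructure home is on PATH.
    """
    result = subprocess.run([crsctl, "check", "crs"],
                            capture_output=True, text=True)
    return parse_crs_check(result.stdout)


# Illustrative output only (assumption, not taken from the affected node)
sample_output = """\
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
"""

if __name__ == "__main__":
    for message, online in parse_crs_check(sample_output).items():
        print(("OK   " if online else "FAIL ") + message)
```

On a real node, `run_crs_check()` would replace the embedded sample; any line that does not end in "is online" points at the daemon to investigate.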
On Node 2, the ASM alert log contains the following messages:
Tue Jun 13 10:59:17 2017
Reconfiguration started (old inc 10, new inc 12)
List of instances:
1 2 (myinst: 1)
Global Resource Directory frozen
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Tue Jun 13 10:59:17 2017
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Submitted all GCS remote-cache requests
Fix write in gcs resources
Reconfiguration complete
Tue Jun 13 11:03:01 2017
IPC Send timeout detected. Sender: ospid 3173 [[email protected] (PING)]
Receiver: inst 2 binc 429480538 ospid 3190
Tue Jun 13 12:12:38 2017
NOTE: [[email protected] (TNS V1-V3) 21461] opening OCR file
Tue Jun 13 12:12:38 2017
NOTE: [[email protected] (TNS V1-V3) 21461] opening OCR file
Tue Jun 13 13:38:34 2017
MEMORY_TARGET defaulting to 1128267776.
* instance_number obtained from CSS = 1, checking for the existence of node 0...
* node 0 does not exist. instance_number = 1
Starting ORACLE instance (normal)
Tue Jun 13 13:42:20 2017
WARNING: Waited 15 secs for write IO to PST disk 0 in group 1.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 1.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 2.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 2.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 3.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 3.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 4.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 4.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 5.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 5.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 6.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 6.
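When an alert log is long, it can help to tally the PST write-wait warnings per disk group to see whether the I/O stalls hit every group or only some. The sketch below is one possible way to do that. The function name `count_pst_waits` is hypothetical, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches the PST write-wait warning, ignoring case and spacing differences.
PST_WAIT = re.compile(
    r"waited\s+(\d+)\s+secs\s+for\s+write\s+io\s+to\s+pst\s+disk\s+(\d+)"
    r"\s+in\s+group\s+(\d+)",
    re.IGNORECASE,
)


def count_pst_waits(log_text):
    """Return a Counter mapping disk group number to its warning count."""
    counts = Counter()
    for match in PST_WAIT.finditer(log_text):
        counts[int(match.group(3))] += 1
    return counts


# A few lines taken from the alert log shown above
sample = """\
WARNING: Waited 15 secs for write IO to PST disk 0 in group 1.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 1.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 2.
"""

if __name__ == "__main__":
    for group, n in sorted(count_pst_waits(sample).items()):
        print(f"group {group}: {n} warning(s)")
```

In the log above, all six disk groups report the same warning, which suggests a storage-path problem on the node rather than a fault in a single disk group.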
This is most likely why the CRSD process hit an error and exited. The database can be opened normally, but the database and listener on Node 2 cannot start automatically, and the VIP is affected as well. In addition, the tools on Node 2 that need OCR information, such as ocrconfig, ocrcheck, and srvctl, are unavailable.
There is no solution yet. Readers who have run into the same problem are welcome to share how they resolved it.