In 11.2, the CRSD process is no longer one of the most critical processes in RAC.
If you are familiar with 10g RAC, you know how important the CRSD process is: after the operating system boots, Oracle starts CRSD, which then brings up the entire cluster and the database.
In 11.2 RAC, Oracle changed ASM so that the OCR and the voting disk can be stored in an ASM disk group. But ASM is itself a component managed by the cluster, while the OCR and voting disk that the cluster needs in order to boot now live inside ASM, which creates a chicken-and-egg problem. Oracle resolved it by introducing the OHASD process. Along the way the architecture of the whole cluster and of ASM changed significantly, and OHASD replaced CRSD as the most critical process in a RAC environment.
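The chicken-and-egg problem can be illustrated with a toy dependency model. This is only a sketch: the component names and edges are a deliberate simplification of the description above, not Oracle's actual startup graph, and the helper `startup_order` is hypothetical. It uses Python's standard `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter, CycleError


def startup_order(depends_on):
    """Return one valid start order, or None if the dependencies form a cycle.

    depends_on maps each component to the set of components that must
    already be running before it can start.
    """
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError:
        return None


# Hypothetical: CRSD still drives everything, but the OCR now lives in ASM
# and ASM is a resource that CRSD itself has to start.
CRSD_FIRST = {
    "crsd": {"ocr"},   # CRSD must read the OCR
    "ocr": {"asm"},    # the OCR is stored in an ASM disk group
    "asm": {"crsd"},   # ASM is started by the cluster (CRSD)
}

# Simplified 11.2 picture: OHASD sits underneath and brings up ASM first.
OHASD_FIRST = {
    "asm": {"ohasd"},
    "ocr": {"asm"},
    "crsd": {"ocr"},
}

if __name__ == "__main__":
    print("CRSD-first order: ", startup_order(CRSD_FIRST))
    print("OHASD-first order:", startup_order(OHASD_FIRST))
```

In this model the first graph has no valid start order because of the cycle, while the second starts OHASD, then ASM, then makes the OCR available to CRSD.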
The importance of CRSD has dropped dramatically. A couple of days ago, in a customer's 11.2 RAC environment, I found that even when CRSD on a node fails to start, the database can still be started manually and accessed normally.
The cause appears to have been an error accessing the disk group that holds the OCR and voting disk on node 2. CRSD made several attempts to read the information stored in the OCR and then exited automatically, so node 2 could not start up properly. However, apart from CRSD, all the other cluster processes on node 2 started, the ASM instance could be started as well, and the database on node 2 could be started manually.
The errors in the ASM alert log on node 2 are as follows:
Tue 18 14:09:18 2011
NOTE: client +ASM2:+ASM registered, osid: 13113, mbr: 0x0
Errors in file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_ora_13108.trc:
ORA-15180: could not open dynamic library ASM Library - Generic Linux, version 2.0.4 (KABI_V2), error [open]
ERROR: error ORA-15180 caught in ASM I/O path
Errors in file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_ora_13108.trc:
ORA-15081: failed to submit an I/O operation to a disk
WARNING: failed to online diskgroup resource ora.DATADG.dg (unable to communicate with CRSD/OHASD)
Tue 18 14:09:19 2011
NOTE: [crsd.bin@findb2 (TNS V1-V3) 13121] opening OCR file
Errors in file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_ora_13130.trc:
ORA-15180: could not open dynamic library ASM Library - Generic Linux, version 2.0.4 (KABI_V2), error [open]
ERROR: error ORA-15180 caught in ASM I/O path
Errors in file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_ora_13130.trc:
ORA-15081: failed to submit an I/O operation to a disk
Tue 18 14:09:20 2011
WARNING: failed to online diskgroup resource ora.FRADG.dg (unable to communicate with CRSD/OHASD)
Tue 18 14:09:21 2011
NOTE: [crsd.bin@findb2 (TNS V1-V3) 13134] opening OCR file
Errors in file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_ora_13143.trc:
ORA-15180: could not open dynamic library ASM Library - Generic Linux, version 2.0.4 (KABI_V2), error [open]
ERROR: error ORA-15180 caught in ASM I/O path
Errors in file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_ora_13143.trc:
ORA-15081: failed to submit an I/O operation to a disk
This is why the CRSD process hit an error and exited. The database can be opened normally, but the database and listener on node 2 cannot start automatically, and the VIP also has problems. In addition, tools that need OCR information, such as ocrconfig, ocrcheck, and srvctl, are unusable on node 2.
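When an alert log repeats the same few errors many times, as the excerpt above does, a short script can condense it. The following is a minimal sketch, not an Oracle tool: the function name `summarize_alert` and both regular expressions are assumptions based only on the line shapes visible in the excerpt. It counts the lines that begin with an ORA- code and collects the disk groups that failed to come online. Matching is case-insensitive so it also tolerates copies of the log whose capitalization has been altered.

```python
import re
from collections import Counter

# A line that itself starts with an ORA- error code, e.g. "ORA-15081: ...".
ORA_LINE = re.compile(r"^\s*ORA-(\d+)", re.IGNORECASE)
# "failed to online diskgroup resource ora.<NAME>.dg"; stray spaces tolerated.
DG_FAIL = re.compile(
    r"failed to online diskgroup resource\s+ora\.\s*(\w+)\.dg", re.IGNORECASE
)


def summarize_alert(text):
    """Return (Counter of ORA codes, list of disk groups that failed to online)."""
    codes = Counter()
    diskgroups = []
    for line in text.splitlines():
        m = ORA_LINE.match(line)
        if m:
            codes["ORA-" + m.group(1)] += 1
        m = DG_FAIL.search(line)
        if m:
            name = m.group(1).upper()
            if name not in diskgroups:
                diskgroups.append(name)
    return codes, diskgroups


if __name__ == "__main__":
    import sys

    # Usage (the path is whatever alert log you want to inspect):
    #   python summarize_alert.py /path/to/alert.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        codes, diskgroups = summarize_alert(fh.read())
    for code, count in codes.most_common():
        print(code, count)
    print("disk groups that failed to online:", ", ".join(diskgroups) or "none")
```

Only lines that start with an ORA- code are counted, so the accompanying "error ORA-15180 caught in ASM I/O path" lines do not inflate the totals.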
Of course, this state is generally not acceptable, and in the end the problem was resolved by rebuilding the RAC environment. But the case also shows how much the cluster architecture changed between 10g and 11g.