A record of troubleshooting a failure after a successful Oracle Clusterware installation

Source: Internet
Author: User
Tags: semaphore


1. Environment

~]$ cat /etc/issue
Red Hat Enterprise Linux Server release 5.8 (Tikanga)
Kernel \r on an \m

2. Description of the problem

During the RAC installation, the nodes were shut down after the grid infrastructure (Clusterware) had been installed successfully. When the nodes were next powered on, checking the CRS resource status on each node returned the following error:

~]$ crs_stat -t -v
CRS-0184: Cannot communicate with the CRS daemon.

3. Analysis and Resolution

Check the CRS status:

~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services    # unable to communicate with CRS
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
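When the same check has to be repeated on every node, the status lines can be screened mechanically. The following is a minimal Python sketch and not part of the original troubleshooting; the function name `failing_components` is hypothetical, and it assumes each `CRS-nnnn: message` entry arrives on its own line, which is how `crsctl check crs` normally prints them.

```python
import re

# One "CRS-nnnn: message" entry per line.
CRS_LINE = re.compile(r"^(CRS-\d+):\s*(.+?)\s*$")

def failing_components(output):
    """Return (code, message) pairs for lines that do not report 'is online'."""
    failures = []
    for line in output.splitlines():
        match = CRS_LINE.match(line.strip())
        if match is None:
            continue  # skip prompts, blank lines, and anything unrecognised
        code, message = match.groups()
        if not message.lower().endswith("is online"):
            failures.append((code, message))
    return failures

# The four status lines reported in this incident.
sample = """\
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
"""
print(failing_components(sample))
```

Only the CRS-4535 entry is returned for this input, which matches the manual reading: the lower stack is up and only CRSD is unreachable.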

View the corresponding CRSD log:

...:13.490: [GIPCXCPT][1002185440] gipcShutdownF: skipping shutdown, count 2, from [clsinet.c : 1732], ret gipcretSuccess (0)
...:13.492: [GIPCXCPT][1002185440] gipcShutdownF: skipping shutdown, count 1, from [clsgpnp0.c : 1021], ret gipcretSuccess (0)
...:13.498: [OCRASM][1002185440]proprasmo: Error in open/create file in dg [DATA]    # failed to open the disk group
[OCRASM][1002185440]SLOS : SLOS: cat=7, opn=kgfoAl06, dep=15077, loc=kgfokge
ORA-15077: could not locate ASM instance serving a required diskgroup    # no ASM instance
...:13.498: [OCRASM][1002185440]proprasmo: kgfoCheckMount returned [7]
...:13.498: [OCRASM][1002185440]proprasmo: The ASM instance is down    # the ASM instance is not running
...:13.499: [OCRRAW][1002185440]proprioo: Failed to open [+DATA]. Returned proprasmo() with [ -]. Marking location as UNAVAILABLE.
...:13.499: [OCRRAW][1002185440]proprioo: No OCR/OLR devices are usable
...:13.499: [OCRASM][1002185440]proprasmcl: asmhandle is NULL
...:13.499: [OCRRAW][1002185440]proprinit: Could not open raw device
...:13.499: [OCRASM][1002185440]proprasmcl: asmhandle is NULL
...:13.499: [OCRAPI][1002185440]a_init: -!: Backend init unsuccessful : [ -]
...:13.499: [CRSOCR][1002185440] OCR context init failure. Error: PROC- -: Error while accessing the physical storage ASM error [SLOS: cat=7, opn=kgfoAl06, dep=15077, loc=kgfokge
ORA-15077: could not locate ASM instance serving a required diskgroup] [7]
...:13.499: [CRSD][1002185440][PANIC] CRSD exiting: Could not init OCR, code: -
...:13.499: [CRSD][1002185440] Done.

The log shows that the ASM instance is not running. CRSD therefore cannot open the OCR in disk group DATA, fails to initialize, and exits.
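In a long CRSD log it can help to pull out the Oracle error codes first and read the surrounding lines afterwards. The sketch below is an illustration in Python, not something the original author ran; `error_codes` is a hypothetical helper, and the sample text is a cleaned-up excerpt of the messages quoted in this article.

```python
import re
from collections import Counter

# Matches codes such as ORA-15077, PROC-26 or CRS-4535 as whole words.
ERROR_CODE = re.compile(r"\b(?:ORA|PROC|CRS)-\d+\b")

def error_codes(log_text):
    """Count Oracle-style error codes (ORA-, PROC-, CRS-) in a log excerpt."""
    return Counter(ERROR_CODE.findall(log_text))

sample = (
    "[OCRASM][1002185440]proprasmo: Error in open/create file in dg [DATA]\n"
    "ORA-15077: could not locate ASM instance serving a required diskgroup\n"
    "[OCRASM][1002185440]proprasmo: The ASM instance is down\n"
    "ORA-15077: could not locate ASM instance serving a required diskgroup\n"
)
print(error_codes(sample).most_common())
```

The most frequent code is usually a reasonable starting point; here it is ORA-15077, which points at the ASM instance rather than at CRSD itself.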

Try to start the ASM instance manually:

~]$ asmcmd
Connected to an idle instance.
ASMCMD> startup
ORA-27154: post/wait create failed
ORA-27300: OS system dependent operation:semget failed with status: 28
ORA-27301: OS failure message: No space left on device
ORA-27302: failure occurred at: sskgpsemsper
Connected to an idle instance.

The output above shows that the failed operation is semget (the failure occurred at sskgpsemsper).
semget obtains a set of semaphores, so "No space left on device" here does not refer to storage space but to semaphore resources: the call returns this error when creating the set would exceed a system-wide semaphore limit.
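To make the meaning of this error concrete, the sketch below models the conditions under which creating a semaphore set fails, following the error descriptions in the semget(2) manual page. It is a simplified Python model and not kernel or Oracle code; the function name and all numbers in the usage lines are hypothetical.

```python
import errno
import os

def semget_create_errno(nsems, sets_in_use, sems_in_use, semmsl, semmns, semmni):
    """Model which errno creating a new set of `nsems` semaphores would hit.

    Returns 0 when the request fits within all three limits.
    """
    if nsems <= 0 or nsems > semmsl:
        return errno.EINVAL   # request exceeds the per-set limit (SEMMSL)
    if sets_in_use + 1 > semmni:
        return errno.ENOSPC   # no free semaphore set identifier (SEMMNI)
    if sems_in_use + nsems > semmns:
        return errno.ENOSPC   # system-wide semaphore total exceeded (SEMMNS)
    return 0

# Hypothetical situation: nothing is allocated yet, but a single request
# is larger than the system-wide total allows.
code = semget_create_errno(nsems=200, sets_in_use=0, sems_in_use=0,
                           semmsl=250, semmns=100, semmni=128)
print(errno.errorcode[code], "-", os.strerror(code))
```

The usage lines illustrate that the error can occur even when no semaphore sets are currently in use, if one request by itself does not fit under SEMMNS.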

Check the semaphore usage in the system:

~]$ ipcs

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 3407873    root       644        ...        2
0x00000000 3440643    root       644        16384      2
0x00000000 3473412    root       644        280        2

------ Semaphore Arrays --------
key        semid      owner      perms      nsems

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages

Nothing abnormal was found; no semaphore arrays are listed. Next, check the semaphore limits (such as SEMMNS) in the kernel parameters:

~]# sysctl -a | grep sem

The four fields of kernel.sem are, in order:
SEMMSL: the maximum number of semaphores in each semaphore set; it should be about 10 larger than the maximum number of Oracle processes
SEMMNS: the maximum number of semaphores in the whole system
SEMOPM: the maximum number of operations per semop call
SEMMNI: the maximum number of semaphore set identifiers, which limits how many semaphore sets can exist at any one time
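Because the four limits are reported as bare numbers, it is easy to mix up their order. The following Python sketch labels them; `SemLimits` and `parse_kernel_sem` are hypothetical names, and the numbers in the usage line are illustrative values, not the ones from the system in this article.

```python
from typing import NamedTuple

class SemLimits(NamedTuple):
    semmsl: int  # max semaphores per set
    semmns: int  # max semaphores system-wide
    semopm: int  # max operations per semop call
    semmni: int  # max number of semaphore sets

def parse_kernel_sem(text):
    """Parse 'kernel.sem = a b c d' or the bare fields of /proc/sys/kernel/sem."""
    values = text.split("=", 1)[-1].split()
    if len(values) != 4:
        raise ValueError(f"expected 4 fields, got {len(values)}: {text!r}")
    return SemLimits(*(int(v) for v in values))

# Illustrative values only.
limits = parse_kernel_sem("kernel.sem = 250 32000 100 128")
print(limits)
```

On Linux the same function can be fed the contents of /proc/sys/kernel/sem, which holds the four fields in the same order without the `kernel.sem =` prefix.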

Increase the semaphore limits in the system (/etc/sysctl.conf):

kernel.sem = ... 32768 ... 228
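When choosing new values, it is worth checking that they are mutually consistent: according to the semget(2) manual page, the system-wide number of semaphores is limited not only by SEMMNS but also by the product SEMMSL × SEMMNI. The Python sketch below performs that check and renders the configuration line. It assumes the two legible values in the line above (32768 and 228) are SEMMNS and SEMMNI, following the usual field order; the SEMMSL and SEMOPM values (250 and 100) are placeholders chosen for illustration, not the author's settings.

```python
def effective_semaphore_capacity(semmsl, semmns, semmni):
    """Usable system-wide semaphore count: SEMMNS, capped by SEMMSL * SEMMNI."""
    return min(semmns, semmsl * semmni)

def sysctl_line(semmsl, semmns, semopm, semmni):
    """Render the /etc/sysctl.conf entry in kernel field order."""
    return f"kernel.sem = {semmsl} {semmns} {semopm} {semmni}"

# SEMMNS=32768 and SEMMNI=228 are read from the article (assumed field order);
# SEMMSL=250 and SEMOPM=100 are hypothetical placeholders.
semmsl, semmns, semopm, semmni = 250, 32768, 100, 228

print(sysctl_line(semmsl, semmns, semopm, semmni))
print(effective_semaphore_capacity(semmsl, semmns, semmni))
```

With these numbers the product 250 × 228 = 57000 is larger than 32768, so SEMMNS is the binding limit. Edits to /etc/sysctl.conf are normally applied with `sysctl -p` or at the next boot.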

Try to start the ASM instance again:

ASMCMD> startup
ORA-03113: end-of-file on communication channel
Connected to an idle instance.

Because I was anxious to continue the experiment, I simply rebooted both nodes at this point. After the reboot the ASM instance started normally and the CRS resource status was normal, so the problem was resolved.

After the experiment was finished, I looked up ORA-03113. The possible causes of this error include:

1) UNIX kernel parameters set incorrectly
2) Incorrect permissions on the Oracle executables, or environment variable problems
3) Client communication not handled correctly
4) Database server crash, OS crash, or the process being killed
5) Oracle internal error
6) An error in a specific SQL or PL/SQL statement
7) Insufficient space
8) Firewall issues

Unfortunately, because the failing environment no longer existed, I could not investigate this error further. It is recorded here for future reference.

4. References

1) [Oracle 11g RAC CRS-4535/ORA-15077] http://blog.csdn.net/l106439814/article/details/8969060
2) [ASM startup error ORA-27300, ORA-27301 and ORA-27302: failure occurred at: sskgpsemsper] http://www.51itstudy.com/33735.html
3) [DBA notes: handling of shared memory not being released properly] http://www.eygle.com/archives/2011/03/ipcs_semaphore.html
4) [ORA-03113: end-of-file on communication channel, locating the error] http://www.51itstudy.com/6628.html
