Composition of Oracle Clusterware

Last Update:2018-12-07 Source: Internet

Author: User

Tags failover


Oracle Clusterware is a separate installation package. After installation, Oracle Clusterware starts automatically on each node. Its operating environment consists of two disk files (OCR and voting disk), several background processes, and network elements.

Disk Files:

While running, Clusterware requires two files: OCR and the voting disk. Both files must be stored on shared storage. OCR is used to solve the amnesia problem, and the voting disk is used to solve the split-brain problem. Oracle recommends using raw devices to store these two files: create one raw device per file, each allocated a small amount of space, measured in megabytes.

1.1 OCR

The amnesia problem arises because each node keeps its own copy of the configuration information, and a change made on one node is not synchronized to the others. Oracle's solution is to place the configuration file on shared storage; this file is the OCR. The configuration of the entire cluster is saved in OCR as key-value pairs. Before Oracle 10g, this file was called the Server Manageability Repository (SRVM). In Oracle 10g, this part was redesigned and renamed OCR. During installation, the Oracle Clusterware installer prompts you to specify the OCR location. The location you specify is recorded in the /etc/oracle/ocr.loc file (Linux) or the /var/opt/oracle/ocr.loc file (Solaris). In Oracle 9i RAC, the equivalent file is srvConfig.loc. Oracle Clusterware reads the OCR content from the specified location at startup.
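As a rough illustration of the key-value idea, the sketch below parses the text of an ocr.loc-style pointer file into a dictionary. The sample content (the `ocrconfig_loc` and `local_only` keys and the raw-device path) is an assumed example of what such a file may contain, not output captured from a real system, and `parse_ocr_loc` is a hypothetical helper name.

```python
def parse_ocr_loc(text):
    """Parse 'key=value' lines, ignoring blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # skip lines that contain no '=' at all
            settings[key.strip()] = value.strip()
    return settings


# Assumed sample content, for illustration only.
SAMPLE_OCR_LOC = """\
ocrconfig_loc=/dev/raw/raw1
local_only=FALSE
"""

ocr_settings = parse_ocr_loc(SAMPLE_OCR_LOC)
print(ocr_settings["ocrconfig_loc"])
```

Because every node reads the same pointer and the OCR itself sits on shared storage, there is a single copy of the configuration to update, which is what avoids the amnesia problem.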

1.2 voting Disk

The voting disk file is mainly used to record node membership status. When split-brain occurs, it is used to decide which partition gets control; the other partitions must be removed from the cluster. You are also prompted to specify its location when installing Clusterware. After the installation is complete, run the following command to view the location of the voting disk:

 $ crsctl query css votedisk
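The arbitration step can be pictured with a small model. The text does not say how the winning partition is chosen, so the rule used here is an assumption made for this sketch: the partition with the most nodes wins, and a tie goes to the partition that holds the lowest node number. The function names are hypothetical.

```python
def surviving_partition(partitions):
    """Pick the partition that keeps control of the cluster.

    Assumed rule for this sketch: the partition with the most nodes wins,
    and a tie goes to the partition holding the lowest node number.
    Each partition is a non-empty set of node numbers.
    """
    return max(partitions, key=lambda nodes: (len(nodes), -min(nodes)))


def evicted_nodes(partitions):
    """Return the sorted node numbers that must leave the cluster."""
    winner = surviving_partition(partitions)
    return sorted(n for nodes in partitions if nodes is not winner for n in nodes)


# A four-node cluster whose private network splits into {1, 2} and {3, 4}.
split = [{3, 4}, {1, 2}]
print(surviving_partition(split), evicted_nodes(split))
```

Whatever the exact rule, the point is that all partitions consult the same shared file, so they reach the same decision about who stays.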

Background process:

Clusterware is composed of several processes. The three most important are crsd, cssd, and evmd. At the final stage of the Clusterware installation, you are required to run the root.sh script on each node. This script adds the three processes as startup entries at the end of the /etc/inittab file, so that Clusterware starts automatically each time the system boots. If evmd or crsd runs abnormally, the system automatically restarts that process. If cssd runs abnormally, the node reboots immediately.

 [root@node1 ~]# ls -l /etc/init.d/init.*
 -r-xr-x 1 root  1951 Mar 24 05:30 /etc/init.d/init.crs
 -r-xr-x 1 root  4719 Mar 24 05:30 /etc/init.d/init.crsd   -- crsd
 -r-xr-x 1 root 35399 Mar 24 05:30 /etc/init.d/init.cssd   -- cssd
 -r-xr-x 1 root  3195 Mar 24 05:30 /etc/init.d/init.evmd   -- evmd
 [root@node1 ~]#
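To make the inittab mechanism more concrete, here is a minimal sketch that parses entries in the standard `id:runlevels:action:process` format. The three sample lines are assumptions written for illustration (the source does not show the real entries); `respawn` is the SysV init action that restarts a process whenever it exits, and `parse_inittab` is a hypothetical helper.

```python
from collections import namedtuple

InittabEntry = namedtuple("InittabEntry", "ident runlevels action process")


def parse_inittab(text):
    """Parse 'id:runlevels:action:process' lines, skipping comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first three colons only; the command may contain more.
        fields = line.split(":", 3)
        if len(fields) == 4:
            entries.append(InittabEntry(*fields))
    return entries


# Assumed sample entries, written for illustration only.
SAMPLE_INITTAB = """\
h1:35:respawn:/etc/init.d/init.evmd run >/dev/null 2>&1 </dev/null
h2:35:respawn:/etc/init.d/init.cssd fatal >/dev/null 2>&1 </dev/null
h3:35:respawn:/etc/init.d/init.crsd run >/dev/null 2>&1 </dev/null
"""

clusterware_entries = parse_inittab(SAMPLE_INITTAB)
respawned = [e.process.split()[0] for e in clusterware_entries if e.action == "respawn"]
print(respawned)
```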

2.1 cssd sub-process ocssd:

 [oracle@node1 bin]$ ps -ef | grep -v grep | grep cssd
 root    5673     1  0 23:15 ?  00:00:00 /bin/sh /etc/init.d/init.cssd fatal
 root    6328  5673  0 23:19 ?  00:00:00 /bin/sh /etc/init.d/init.cssd daemon
 root    6453  6328  0 23:19 ?  00:00:00 /bin/su -l oracle -c /bin/sh -c 'ulimit -c unlimited; cd /opt/ora10g/product/10.2.0/crs_1/log/node1/cssd; /opt/ora10g/product/10.2.0/crs_1/bin/ocssd | exit $?'
 oracle  6454  6453  0 23:19 ?  00:00:00 /bin/sh -c ulimit -c unlimited; cd /opt/ora10g/product/10.2.0/crs_1/log/node1/cssd; /opt/ora10g/product/10.2.0/crs_1/bin/ocssd | exit $?
 oracle  6489  6454  0 23:19 ?  00:00:00 /opt/ora10g/product/10.2.0/crs_1/bin/ocssd.bin

Ocssd is the most critical Clusterware process. If it fails, the node reboots. This process provides the CSS (Cluster Synchronization Services) service. CSS is responsible for the configuration of the entire cluster and determines which nodes are members of the cluster. Whenever a node joins or leaves the cluster, CSS notifies the other nodes to update the cluster configuration. CSS monitors the cluster status in real time through several heartbeat mechanisms and provides basic cluster services such as split-brain protection.

The CSS service has two heartbeat mechanisms: one is the network heartbeat over the private network, and the other is the disk heartbeat through the voting disk.

Each heartbeat has a maximum allowed latency. For the disk heartbeat, this latency is called IOT (I/O timeout). For the network heartbeat, it is called MC (misscount). Both parameters are measured in seconds. By default, IOT is greater than MC. Both values are determined automatically by Oracle, and adjusting them is not recommended.

 $ crsctl get css disktimeout
 $ crsctl get css misscount
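The relationship between the two limits can be sketched as a simple check. This is only a model of the idea, not how ocssd is implemented; the numeric limits below are hypothetical values chosen so that IOT is greater than MC, and `heartbeat_status` is an illustrative name.

```python
def heartbeat_status(now, last_network, last_disk, misscount, disktimeout):
    """Classify a node by how long ago each heartbeat was last seen.

    All arguments are in seconds. A heartbeat is treated as lost once its
    silence exceeds the matching limit (MC for network, IOT for disk).
    """
    network_lost = (now - last_network) > misscount
    disk_lost = (now - last_disk) > disktimeout
    if network_lost and disk_lost:
        return "both heartbeats lost"
    if network_lost:
        return "network heartbeat lost"
    if disk_lost:
        return "disk heartbeat lost"
    return "healthy"


# Hypothetical limits chosen only so that IOT > MC, as the text describes.
MISSCOUNT = 60
DISKTIMEOUT = 200

print(heartbeat_status(now=100, last_network=90, last_disk=95,
                       misscount=MISSCOUNT, disktimeout=DISKTIMEOUT))
```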

Note: Besides Clusterware, this process is also required when ASM is used in a single-node environment, where it supports communication between the ASM instance and the RDBMS instance. If you install RAC on a node that already uses ASM, you will encounter a problem: a RAC node needs only one ocssd process, and it should run from the $CRS_HOME directory. In this case, stop ASM and run $ORACLE_HOME/bin/localconfig.sh delete to remove the previous inittab entry. The same script was used to start ocssd when ASM was first installed: $ORACLE_HOME/bin/localconfig.sh add

2.2 crsd process:

Crsd is the main process for achieving high availability (HA). The service it provides is called CRS (Cluster Ready Services).

Oracle Clusterware is a component at the cluster layer. It must provide high availability services for application-layer resources (CRS resources). It therefore has to monitor these resources and intervene when they run abnormally, for example by shutting down or restarting a process, or by relocating a service. The crsd process provides these services.

All components that require high availability are registered in OCR as CRS resources during installation and configuration. The crsd process uses the content of OCR to decide which processes to monitor, how to monitor them, and how to handle problems. In other words, crsd is responsible for monitoring the running status of CRS resources, including starting, stopping, monitoring, and failing over resources. By default, CRS automatically tries to restart a resource five times. If the resource still fails, CRS stops trying.
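The bounded-restart behaviour described above can be sketched as a small loop. This is a simplified model, not crsd's actual logic: `start` is a hypothetical callable standing in for whatever action brings a resource back, and the state names are illustrative.

```python
def keep_resource_running(start, max_attempts=5):
    """Try to restart a failed resource, giving up after max_attempts.

    `start` is a hypothetical callable that returns True when the
    resource came back online. Returns (final_state, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        if start():
            return "ONLINE", attempt
    return "OFFLINE", max_attempts


# Simulated resource that only comes back on the third restart attempt.
outcomes = iter([False, False, True])
print(keep_resource_running(lambda: next(outcomes)))
```

Capping the number of attempts keeps a permanently broken resource from being restarted forever.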

CRS resources include GSD (Global Service Daemon), ONS (Oracle Notification Service), VIP, database, instance, and service. These resources fall into two categories: GSD, ONS, VIP, and listener belong to the nodeapps class, while database, instance, and service belong to the database-related resource class.

This classification is easy to understand. Nodeapps are resources of which each node needs only one; for example, each node has only one listener. Database-related resources are tied to the database rather than to a node; for example, a node can have multiple instances, and each instance can have multiple services.

2.3 evmd process:

The evmd process is responsible for publishing the events generated by CRS. These events can be published to clients in two ways: ONS and callout scripts. You can write your own callout script and place it in a specific directory. When certain events occur, evmd automatically scans that directory and calls the script. The call is made through the racgevt process.

In addition to publishing events, the evmd process serves as a bridge between the crsd and cssd processes: communication between the CRS and CSS services goes through evmd.

Network elements: principles and features of VIP

3.1 VIP principle:

Oracle's TAF is built on VIP technology. The difference between an ordinary IP address and a VIP is that a failure on an ordinary IP is detected through TCP-layer timeouts, whereas a VIP gives an immediate application-layer response. A VIP is a floating IP address: when a node encounters a problem, its VIP is automatically transferred to another node. Suppose there is a RAC with two nodes, each with one VIP during normal operation. When node 2 fails, RAC performs the following steps:

1). After detecting an exception on node 2 (rac2), CRS triggers a cluster reconfiguration and finally removes rac2 from the cluster. Node 1 forms the new cluster.
2). RAC's failover mechanism transfers the VIP of node 2 to node 1. At this point, the public NIC of node 1 has three IP addresses: VIP1, VIP2, and public IP1.
3). Client connection requests to VIP2 are routed to node 1 by the IP layer.
4). Because VIP2 is available on node 1, the data packets pass through the routing layer, network layer, and transport layer.
5). However, node 1 listens only on VIP1 and public IP1. Nothing listens on VIP2, so no application-layer program receives the packet, and the error is detected immediately.
6). The client receives this error immediately and then re-initiates a connection request to VIP1.
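The six steps above can be modelled in a few lines. This is a toy simulation, not Oracle code: the `Node` class, the address strings, and the `"refused"`/`"timeout"` results are all assumptions introduced for illustration. It shows why relocating the VIP without a listener on it produces an immediate error instead of a long TCP timeout.

```python
class Node:
    def __init__(self, name, public_ip, vip):
        self.name = name
        self.addresses = {public_ip, vip}   # IPs bound to the public NIC
        self.listening = {public_ip, vip}   # IPs the listener accepts on
        self.vip = vip


def fail_over(failed, survivor):
    """Move the failed node's VIP to the survivor's public NIC.

    The survivor's listener is left unchanged, so nothing listens on
    the relocated VIP.
    """
    failed.addresses.discard(failed.vip)
    survivor.addresses.add(failed.vip)


def connect(address, nodes):
    """Return the node name that accepts the connection, or an error string."""
    for node in nodes:
        if address in node.addresses:
            if address in node.listening:
                return node.name
            return "refused"      # IP is up but no listener: immediate error
    return "timeout"              # IP is nowhere: wait for the TCP timeout


node1 = Node("node1", "public_ip1", "vip1")
node2 = Node("node2", "public_ip2", "vip2")
fail_over(failed=node2, survivor=node1)
live_nodes = [node1]

# A client tries vip2 first, gets an immediate refusal, then retries vip1.
first_try = connect("vip2", live_nodes)
second_try = connect("vip1", live_nodes)
print(first_try, second_try)
```

In the model, a connection to the failed node's public IP would fall into the slow `"timeout"` branch, which is the case the VIP is designed to avoid.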

3.2 VIP features:

1). The VIP is created through the vipca script.
2). The VIP is registered in OCR as a nodeapps CRS resource and is maintained by CRS.
3). The VIP address is bound to the public NIC of the node, so the public NIC has two addresses.
4). When a node fails, CRS transfers the VIP address of the faulty node to another node.
5). The listener of each node listens on both the public IP and the VIP on the public NIC.
6). The client's tnsnames.ora is usually configured to point to the VIP of the node.

Clusterware's log system:

File to check first: alert.log
 $CRS_HOME/log/[node]/alert.log
Clusterware background process logs:
 crsd.log   $CRS_HOME/log/[node]/crsd/crsd.log
 ocssd.log  $CRS_HOME/log/[node]/cssd/ocssd.log
 evmd.log   $CRS_HOME/log/[node]/evmd/evmd.log
Nodeapps log location:
 $CRS_HOME/log/[node]/racg/ -- including ONS and VIP logs, such as ora.rac1.ons.log
Tool execution logs:
 $CRS_HOME/log/[node]/client/ -- logs generated by ocrcheck, ocrconfig, ocrdump, oifcfg, and clscfg are stored in this directory.
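For scripting against this layout, a small helper can assemble the paths from the Clusterware home and the node name. The helper name and dictionary keys are hypothetical; the home directory and node name in the example are the ones that appear in the ps listing earlier in this article.

```python
import posixpath


def clusterware_log_paths(crs_home, node):
    """Build the log locations described in this section for one node."""
    base = posixpath.join(crs_home, "log", node)
    return {
        "alert": posixpath.join(base, "alert.log"),
        "crsd": posixpath.join(base, "crsd", "crsd.log"),
        "ocssd": posixpath.join(base, "cssd", "ocssd.log"),
        "evmd": posixpath.join(base, "evmd", "evmd.log"),
        "nodeapps_dir": posixpath.join(base, "racg"),
        "tools_dir": posixpath.join(base, "client"),
    }


paths = clusterware_log_paths("/opt/ora10g/product/10.2.0/crs_1", "node1")
for name, path in paths.items():
    print(name, path)
```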

-- From Oracle RAC

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
