Oracle RAC Cluster architecture

Last Update:2014-05-23 Source: Internet

Author: User



I. Oracle CLUSTER architecture

Oracle RAC, short for Oracle Real Application Clusters, is a parallel cluster system provided by Oracle. The cluster system as a whole consists of two main parts: Oracle Clusterware (the cluster-ready software) and Real Application Clusters (RAC). In essence, Oracle instances running on separate operating systems access the same Oracle database at the same time. The nodes communicate over a private network and monitor each other's running state. All data files, online log files, and control files of the database are placed on the cluster's shared storage, which can be raw devices, ASM, OCFS2, and so on, and all cluster nodes can read and write the shared storage simultaneously. The basic topology of Oracle RAC is as follows:

The topology shows that an Oracle RAC database consists of multiple server nodes. Each node has its own operating system, Clusterware, and Oracle RAC database software, as well as its own network listener. Clusterware is the cluster software and mainly handles cluster management. The Oracle RAC database software provides the Oracle instance processes through which clients access the cluster. The listener service listens on the node's own network ports. All services and programs reach the shared storage through the operating system, and that is where data is ultimately read and written. Shared storage can be implemented in several ways, including Automatic Storage Management (ASM), Oracle Cluster File System (OCFS), raw devices, and network-attached storage (NAS), so that data stays consistent across the whole cluster.

Starting with Oracle 10g, Oracle provides its own cluster software, Oracle Clusterware, which is implemented through CRS (Cluster Ready Services). It is a prerequisite for installing Oracle RAC and the basis for a stable RAC environment. Before Oracle 10g, RAC had to be installed together with third-party cluster software. From Oracle 10g onward, you can use Oracle's own cluster software or a RAC-certified third-party product instead.

In terms of how Oracle operates, each server in the cluster runs one Oracle instance, and the multiple instances all correspond to the same Oracle database, which together form the Oracle database cluster. Please see:

As you can see, the database instances running on the two nodes access the same RAC database. Each node's local disk holds only the Oracle installation and the Clusterware software, while the data files, control files, online log files, archived log files, and so on reside on shared storage. This is one way to lay out storage when installing Oracle RAC. RAC in fact supports several storage methods, which are described separately below.

II. Oracle Clusterware architecture and processes

2.1, Oracle Clusterware introduction

Cluster Ready Services (CRS) is cluster software developed by Oracle. Like other cluster software, CRS mainly handles cluster membership management, heartbeat monitoring, failover, and similar functions. CRS requires every cluster node to run the same operating system. It binds the operating systems of the nodes together so that a client accessing the cluster sees what looks like a single server.

CRS relies on two cluster components: the voting disk and the Oracle Cluster Registry.

Voting disk: each node in the cluster periodically evaluates its own health and writes its status information to the voting disk. Nodes also check one another's running state, exchange that information, and record it on the voting disk. When a cluster node fails, the voting disk is also used for the quorum vote, so it must be placed on shared storage where every node can access it. The voting disk can be either a raw disk partition or a standalone file. Because it records only node status information, 10 to 20 MB is generally enough.

Oracle Cluster Registry (OCR): the cluster registry is used primarily to record configuration information for the cluster and the databases in the RAC. This information includes the list of cluster nodes, the mapping of database instances to nodes, and CRS application resource information.
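To make the voting disk idea above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not Oracle's on-disk format or algorithm: the `VotingDisk` class is an in-memory stand-in for the shared device, the record layout is hypothetical, `max_age` is an assumed freshness limit, and `has_quorum` uses a plain majority rule that the article does not specify.

```python
import time
from dataclasses import dataclass


@dataclass
class StatusRecord:
    """One node's latest self-reported status (hypothetical layout)."""
    node: str
    healthy: bool
    written_at: float  # seconds, same clock as time.time()


class VotingDisk:
    """In-memory stand-in for the shared voting disk (assumption)."""

    def __init__(self):
        self._records = {}

    def write_status(self, node, healthy, now=None):
        # Each node overwrites its own slot with its latest status.
        stamp = time.time() if now is None else now
        self._records[node] = StatusRecord(node, healthy, stamp)

    def live_nodes(self, now, max_age):
        # A node counts as alive if it reported healthy recently enough.
        return sorted(
            r.node for r in self._records.values()
            if r.healthy and now - r.written_at <= max_age
        )


def has_quorum(live, members):
    # Assumed rule: strictly more than half of the configured members.
    return len(live) * 2 > len(members)


disk = VotingDisk()
disk.write_status("node1", True, now=100.0)
disk.write_status("node2", True, now=100.0)
disk.write_status("node3", True, now=70.0)  # last report is too old

live = disk.live_nodes(now=105.0, max_age=30.0)
print(live, has_quorum(live, ["node1", "node2", "node3"]))
```

Passing `now` explicitly keeps the example deterministic; a real node would simply use the current time.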
CRS uses two heartbeat mechanisms to verify the state of member nodes and protect the integrity of the cluster. The first is the voting disk: the Cluster Synchronization Services process writes a heartbeat message to the voting disk every few seconds, and the cluster verifies node state through the voting disk. If a node does not write to the voting disk within the specified maximum time, the cluster considers the node failed and performs a failover. The second is a heartbeat over the private Ethernet network between nodes, which is used to detect network failures between nodes. Combining the two mechanisms makes the cluster considerably more reliable. In addition, CRS recommends that the private Ethernet heartbeat used for internal communication be kept separate from the RAC node communication network, that is, the two should not share a network. If they do share one, you must make sure that the network cannot be reached by nodes outside the cluster.

2.2, Oracle Clusterware process introduction

Oracle Clusterware delivers its cluster functions through Cluster Ready Services. CRS comprises a set of cooperating background processes, and the most important ones are described below.

(1) Cluster Synchronization Services (CSS) manages and coordinates the relationships between nodes in the cluster and handles inter-node communication. CSS notifies the cluster when a node joins or leaves. The corresponding background process is CSSD, which is run and managed by the oracle user. When a node fails, CSSD automatically restarts the operating system.
(2) Cluster Ready Services (CRS) is the main program for managing high-availability operations within the cluster. CRS manages all resources in the cluster, including databases, services, instances, VIP addresses, listeners, and application processes. The corresponding background process is CRSD, which can start, stop, monitor, and fail over cluster resources. In normal operation CRSD monitors the resources on the node, and when a resource behaves abnormally it automatically restarts or relocates that resource.

(3) Process Monitor Daemon (OPROCD) is locked in memory, monitors the cluster, and provides I/O fencing. OPROCD runs on each node and performs regular health checks. If a check exceeds the expected interval, OPROCD resets the processor and restarts the node. An OPROCD failure also causes Clusterware to restart the node.

(4) Oracle Notification Service (ONS) is used primarily to publish and subscribe to Fast Application Notification (FAN) events.

(5) Event Management (EVM) is a background process for event detection, run and managed by the oracle user.

III. RAC database system mechanisms and processes

3.1, RAC introduction

RAC is a clustered database with a shared cache architecture that overcomes the limitations of the traditional shared-nothing and shared-disk approaches and provides a highly scalable and available database solution for all business applications. RAC is typically combined with Oracle Clusterware or third-party cluster software to form an Oracle cluster system. RAC is a shared-everything architecture: all data files, control files, online log files, parameter files, and so on must be stored on shared disk so that every node in the cluster can access them. RAC supports multiple storage methods.
You can use any of the following methods:

(1) Raw devices
Data is written directly to disk without a file system. The advantage is high disk I/O performance, which suits write-heavy business systems. The drawback is equally clear: data maintenance and backup are inconvenient. Backups can only be made with the dd command or a block-level backup device, which increases maintenance costs.

(2) Cluster file system
To support shared storage, Oracle developed the cluster file system OCFS, which can be used on Windows, Linux, and Solaris and has since evolved into OCFS2. With OCFS2, multiple cluster nodes can read and write the same disk at the same time without corrupting data, although performance is not very high for business systems with heavy read and write loads. In addition, Oracle RAC supports third-party cluster file systems, such as Red Hat's GFS.

(3) Network File System (NFS)

(4) Automatic Storage Management
Automatic Storage Management (ASM) is the shared storage method Oracle recommends, and it is a feature included in Oracle Database 10g. ASM stores data in the raw-device manner but adds data management functions, and by writing data directly to disk it avoids the I/O overhead of a file system. As a result, ASM makes shared data easy to manage and provides the performance of asynchronous I/O. ASM can also optimize performance by distributing the I/O load, which removes the need to tune I/O manually.
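The idea of distributing the I/O load across disks can be pictured with a short Python sketch. This is a deliberately simplified illustration, not ASM's actual allocation algorithm: the function name `place_extents`, the disk names, and the round-robin rule are all assumptions made for the example.

```python
from collections import Counter


def place_extents(num_extents, disks):
    """Assign each extent of a file to a disk in round-robin order.

    Round-robin is an assumption chosen for clarity; it only shows how
    spreading extents keeps any single disk from taking all the I/O.
    """
    if not disks:
        raise ValueError("need at least one disk")
    return {extent: disks[extent % len(disks)] for extent in range(num_extents)}


placement = place_extents(10, ["disk1", "disk2", "disk3"])
load = Counter(placement.values())
print(dict(load))
```

With an even spread like this, the number of extents on any two disks differs by at most one, so reads and writes of the file are shared among all the disks without manual placement.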
3.2, Oracle RAC features

A high-performance, highly reliable database cluster system can be built with a RAC database. The advantages of RAC are:

(1) Load balancing across multiple nodes
A RAC database cluster can balance load between cluster nodes according to the configured scheduling policy, so that every node is working and is also monitored. When a node fails, the cluster automatically isolates it and transfers its requests to the remaining healthy nodes, switching service transparently.

(2) Highly available service
This is provided by Oracle Clusterware. CRS handles node state monitoring and transparent failover, which ensures that the Oracle database can provide uninterrupted service.

(3) More concurrent connections through horizontal scaling
This advantage makes RAC well suited to large online transaction systems.

(4) Better transaction response time through parallel execution
This is a major advantage of a RAC cluster and is typically used in data sharing systems.

(5) Good scalability
When the cluster can no longer keep up with a busy business system, nodes can be added easily and online. They join the cluster automatically, with no downtime. When a node is no longer needed, removing it is just as simple.

The RAC database also has certain disadvantages:

(1) Compared with a single-instance database, management and maintenance are more complex and demand more of the maintenance staff.
(2) If the underlying planning and design are poor, the overall performance of the system will be poor, possibly even worse than that of a single-node system. Therefore, if you are not familiar with RAC, it is not recommended to put it straight into a production environment.
(3) Because a RAC cluster requires multiple nodes, multiple servers must be purchased and Oracle Enterprise Edition is required, which increases hardware and software costs.

3.3, RAC process management

A RAC database consists of multiple nodes, each of which runs a database instance with its own background processes and memory structures. Within the cluster, the background processes and memory structures of every instance are the same, so taken as a whole the cluster looks like a single database image. However, a RAC database is structurally different from a single-instance database:

(1) Each instance of the RAC database has at least one additional redo thread.
(2) Each instance of the RAC database has its own undo tablespace.

With this mechanism, each instance uses its own redo thread and undo tablespace independently and locks the data it modifies. This design keeps the operations of multiple instances relatively independent. How, then, does RAC keep data consistent across nodes? Each RAC instance has a buffer cache in its SGA, and through Cache Fusion technology RAC synchronizes the cached information in the SGAs of the nodes. This ensures the consistency of node data and also improves the access speed of the cluster.

The most important feature of a RAC database is sharing. Orderly data sharing among multiple nodes relies on two RAC services: the Global Cache Service (GCS) and the Global Enqueue Service (GES). They are the most basic RAC processes and are used primarily to coordinate simultaneous access to the shared database and to shared resources within the database.
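The Cache Fusion idea described above can be sketched as a toy Python model. It is an illustration only and makes several assumptions that are not in the article: the class names `Instance` and `Cluster` are hypothetical, a single `holder` dictionary stands in for the global bookkeeping, and a write simply drops other instances' copies, which is much cruder than what a real RAC database does.

```python
class Instance:
    """One database instance with its own buffer cache (toy model)."""

    def __init__(self, name):
        self.name = name
        self.buffer_cache = {}  # block id -> block contents


class Cluster:
    """Tracks which instance holds the current copy of each block."""

    def __init__(self, instances, disk):
        self.instances = {i.name: i for i in instances}
        self.disk = disk        # shared storage: block id -> contents
        self.holder = {}        # block id -> instance holding current copy
        self.disk_reads = 0
        self.cache_transfers = 0

    def read(self, instance_name, block_id):
        inst = self.instances[instance_name]
        if block_id in inst.buffer_cache:
            return inst.buffer_cache[block_id]
        owner = self.holder.get(block_id)
        if owner is not None:
            # Ship the block from the other instance's cache, not from disk.
            data = self.instances[owner].buffer_cache[block_id]
            self.cache_transfers += 1
        else:
            data = self.disk[block_id]
            self.disk_reads += 1
            self.holder[block_id] = instance_name
        inst.buffer_cache[block_id] = data
        return data

    def write(self, instance_name, block_id, data):
        # Simplification: drop stale copies elsewhere, then take ownership.
        for other in self.instances.values():
            if other.name != instance_name:
                other.buffer_cache.pop(block_id, None)
        self.instances[instance_name].buffer_cache[block_id] = data
        self.holder[block_id] = instance_name


cluster = Cluster([Instance("rac1"), Instance("rac2")], disk={42: "v0"})
cluster.read("rac1", 42)           # first read comes from shared disk
cluster.write("rac1", 42, "v1")    # rac1 modifies the block in its cache
print(cluster.read("rac2", 42), cluster.disk_reads, cluster.cache_transfers)
```

In this model the second instance sees the modified block even though the shared disk still holds the old contents, because the block travels from cache to cache.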
GES and GCS also record and maintain the status information of each data file through the Global Resource Directory (GRD). The GRD is stored in memory and its contents are distributed across all instances, with each instance managing part of it.

Several special processes combine with the GRD to enable Cache Fusion in RAC:

• Global Cache Service Processes (LMSn): the LMS processes mainly manage access to data blocks within the cluster and transfer block images between the buffer caches of different instances.
• Global Enqueue Service Monitor (LMON): LMON mainly monitors resource interactions within the cluster, manages instance and process failures, and handles recovery of cluster enqueues.
• Global Enqueue Service Daemon (LMD): the LMD process mainly manages access to global enqueues and global resources, updates the status of the corresponding enqueues, and processes resource requests from other instances.
• Lock Process (LCK): the LCK process mainly manages inter-instance resource requests and cross-instance call operations, and handles resource requests other than Cache Fusion, such as library cache and row cache requests.
• Diagnosability Daemon (DIAG): the DIAG process mainly captures diagnostic information for failed processes in the instance and generates the corresponding trace files.

3.4, RAC database storage planning

Installing a RAC database involves the Oracle Clusterware and Oracle RAC database software, as well as the voting disk, OCR, and so on. The amount of disk space required for each part is as follows:

Once you know how much disk space each part of RAC requires, you can plan which storage to use for each part.
RAC supports a wide variety of storage methods, such as local file systems (ext2/ext3), cluster file systems (OCFS2/GFS), the network file system NFS, raw devices, and Automatic Storage Management (ASM). The following table lists the types of storage that can be used:

The storage strategy to use depends on the RAC environment being installed. Three commonly used storage methods are recommended here:

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
