Oracle RAC Cluster architecture

Source: Internet
Author: User
Tags failover

I. Oracle cluster architecture

Oracle RAC, short for Oracle Real Application Clusters, is a parallel cluster system provided by Oracle. The whole cluster system consists of two main parts: Oracle Clusterware (the cluster-ready software) and Real Application Clusters (RAC). In essence, Oracle instances running on nodes with separate operating systems access the same Oracle database simultaneously. The nodes communicate over a private network and monitor one another's running state. All data files, online log files, and control files of the database are placed on the cluster's shared storage device, which can be raw devices, ASM, OCFS2, and so on, and all cluster nodes can read and write the shared storage at the same time.

[Figure: basic topology of Oracle RAC]

The topology shows that an Oracle RAC database consists of multiple server nodes, each with its own independent OS, Clusterware, and Oracle RAC database software, and each with its own network listener. Clusterware is the cluster software, used mainly for cluster system management. The Oracle RAC database software provides the Oracle instance processes through which clients access the cluster. The listener service mainly listens on the network ports of its own node. All services and programs go through the operating system to access the shared storage, which is where data is finally read and written. Shared storage can be implemented in several ways, including Automatic Storage Management (ASM), Oracle Cluster File System (OCFS), raw devices (RAW), and network-attached storage (NAS), so that data stays consistent across the entire cluster.

Starting with Oracle 10g, Oracle provides its own cluster software, Oracle Clusterware, implemented through CRS (Cluster Ready Services). It is a prerequisite for installing Oracle RAC and the basis for a stable RAC environment. Before Oracle 10g, RAC had to be installed with third-party cluster software. From Oracle 10g on, you can use either Oracle's own cluster software or RAC-certified third-party cluster software.

In terms of how Oracle operates, each server in the cluster runs an Oracle instance, and the multiple instances correspond to the same Oracle database, forming the Oracle database cluster.

[Figure: database instances on two nodes accessing the same RAC database]

As the figure shows, the database instances running on the two nodes access the same RAC database. Each node's local disk is used only for the Oracle installation and the Clusterware software, while Oracle's data files, control files, online log files, archive log files, and so on reside on shared storage. This is one way to lay out data storage when installing Oracle RAC. RAC in fact supports several storage methods, which are described below.

II. Introduction to Oracle Clusterware architecture and processes

2.1. Introduction to Oracle Clusterware

Cluster Ready Services (CRS) is cluster software developed by Oracle. Like other cluster software, CRS mainly handles cluster membership management, heartbeat monitoring, failover, and similar functions. CRS requires every cluster node to run the same operating system. By binding the operating systems of multiple nodes together, CRS lets clients access the cluster as if it were a single server.

CRS relies on two cluster components: the voting disk and the Oracle Cluster Registry.

Voting disk: each node in the cluster periodically evaluates its own health and writes its status to the voting disk. Nodes also check one another's running state, pass that information to the other nodes, and write it to the voting disk.
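
The status reporting described above can be pictured with a small Python sketch. It is only an illustration, not how Oracle implements it: the VotingDisk class, the node names, and the 10-second timeout are all hypothetical, and the sketch assumes a node counts as failed once its last write is older than the timeout.

```python
class VotingDisk:
    """Toy stand-in for the shared voting disk: node name -> time of last write."""

    def __init__(self):
        self.last_seen = {}

    def write_heartbeat(self, node, now):
        # Each node periodically records that it is still alive.
        self.last_seen[node] = now

    def failed_nodes(self, now, timeout):
        # A node is treated as failed if it has not written within `timeout` seconds.
        return sorted(n for n, t in self.last_seen.items() if now - t > timeout)


disk = VotingDisk()
disk.write_heartbeat("node1", now=0.0)
disk.write_heartbeat("node2", now=0.0)
disk.write_heartbeat("node1", now=5.0)  # node2 stops writing after t=0

print(disk.failed_nodes(now=12.0, timeout=10.0))  # ['node2']
```

Passing the clock value in explicitly keeps the sketch deterministic; a real monitor would read the system clock instead.
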
When a cluster node fails, the voting disk is also used for quorum voting, so it must be placed on a shared storage device to ensure that every node can access it. The voting disk can be either a raw disk partition or a separate file. Because it records only node status information, it generally needs only about 10-20 MB.

Oracle Cluster Registry (OCR): the cluster registry is used mainly to record configuration information for the cluster and the databases in the RAC. This includes the list of cluster nodes, the mapping of cluster database instances to nodes, and CRS application resource information.

CRS uses two heartbeat mechanisms to verify the state of member nodes and protect the integrity of the cluster. One is the voting disk: the Cluster Synchronization Services process writes a heartbeat message to the voting disk every few seconds, and the cluster verifies node state through the voting disk. If a node does not write to the voting disk within the specified maximum time period, the cluster considers the node failed and performs a failover. The other is the heartbeat over the private Ethernet between nodes, which is used to determine whether a network failure has occurred between nodes. Combining the two heartbeat mechanisms makes the cluster considerably more reliable. In addition, CRS recommends keeping the private Ethernet heartbeat used for internal communication separate from the network used for communication between RAC nodes. If the two share one network, you must ensure that nodes outside the cluster cannot access that network.

2.2. Introduction to Oracle Clusterware processes

Oracle Clusterware implements its cluster functions through Cluster Ready Services. CRS comprises a set of cooperating background processes; the most important ones are described below.

(1) Cluster Synchronization Services
Cluster Synchronization Services (CSS) manages and coordinates the relationships between nodes in the cluster and handles inter-node communication. CSS notifies the cluster when a node joins or leaves. The corresponding background process is CSSD, which is run and managed by the oracle user. When a node fails, CSSD automatically restarts the operating system.

(2) Cluster Ready Services
Cluster Ready Services (CRS) is the main program for managing high-availability operations in the cluster. It manages all cluster resources, including databases, services, instances, VIP addresses, listeners, and application processes. The corresponding background process is CRSD, which can start, stop, monitor, and fail over cluster resources. In normal operation CRSD monitors the resources on each node and automatically restarts or relocates a resource when it becomes abnormal.

(3) Process Monitor Daemon
The Process Monitor Daemon (OPROCD) is locked in memory and is used to monitor the cluster and provide I/O fencing. OPROCD runs on each node and performs regular health checks. If it cannot communicate with a node within the expected interval, OPROCD resets the processor and restarts the node. An OPROCD failure also causes Clusterware to restart the node.

(4) Oracle Notification Service
The Oracle Notification Service (ONS) is used mainly to publish and subscribe to Fast Application Notification (FAN) events.

(5) Event Management
Event Management (EVM) is a background process for event detection, run and managed by the oracle user.

III. RAC database structure and processes

3.1. Introduction to RAC

RAC is a cluster database with a shared-cache architecture that overcomes the limitations of the traditional shared-nothing and shared-disk approaches, providing a highly scalable and available database solution for all business applications. It usually works together with Oracle Clusterware or third-party cluster software to form an Oracle cluster system.

RAC is a shared-everything architecture: all data files, control files, online log files, parameter files, and so on must be stored on shared disk, because only then can every node in the cluster access them. RAC supports several storage methods, and any of the following can be used:

(1) Raw devices
Data is written directly to disk without going through a file system. The advantage is high disk I/O performance, which suits business systems with frequent writes. The disadvantage is equally clear: data maintenance and backup are inconvenient, because backups can be made only with the dd command or block-level backup devices, which increases maintenance costs.

(2) Cluster file system
To support shared storage, Oracle developed the cluster file system OCFS, which is available for Windows, Linux, and Solaris and has since evolved into OCFS2. With OCFS2, multiple cluster nodes can read and write the same disk at the same time without corrupting data, although performance is not very high for systems with heavy read and write loads. Oracle RAC also supports third-party cluster file systems, such as Red Hat's GFS.

(3) Network File System (NFS)

(4) Automatic Storage Management
Automatic Storage Management (ASM) is the shared data storage method that Oracle recommends, and it is a feature included in Oracle Database 10g. ASM stores data in raw fashion but adds data management functions, and by writing data directly to disk it avoids the I/O overhead of a file system.
As a result, ASM makes shared data easy to manage while delivering the performance of asynchronous I/O. ASM also optimizes performance by distributing the I/O load, which removes the need to tune I/O manually.

3.2. Characteristics of Oracle RAC

A RAC database makes it possible to build a high-performance, highly reliable database cluster system. Its advantages are:

(1) Load balancing across multiple nodes
A RAC database cluster can balance load among the cluster nodes according to a scheduling policy, so that every node is doing work while the nodes monitor one another. When a node fails, the RAC cluster automatically isolates it from the cluster and transfers its requests to the remaining healthy nodes, so that service switches over transparently.

(2) High-availability services
This function is implemented by Oracle Clusterware. CRS provides node state monitoring and transparent failover, which allows the Oracle database to provide uninterrupted service.

(3) More concurrent connections through scale-out
This advantage makes RAC well suited to large online transaction systems.

(4) Better transaction response time through parallel execution
This is a major advantage of a RAC cluster and is typically used in data sharing systems.

(5) Very good scalability
When the cluster can no longer keep up with a busy business system, cluster nodes can easily be added to the RAC database. Nodes can be added online and join the cluster automatically, with no downtime, and removing a node that is no longer needed is just as simple.
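
Points (1) and (2) can be pictured with a short Python sketch of a client choosing a node and skipping failed ones. This is only an illustration under assumptions: the scheduling policy here is plain random choice, and the connect function, the is_up callback, and the node names are hypothetical rather than part of any Oracle interface.

```python
import random


def connect(nodes, is_up, rng=random):
    """Return the first reachable node, trying the nodes in random order."""
    candidates = list(nodes)
    rng.shuffle(candidates)          # spread new connections across nodes
    for node in candidates:
        if is_up(node):              # skip nodes that have failed
            return node
    raise ConnectionError("no cluster node is available")


# Hypothetical cluster state: node2 has failed.
healthy = {"node1": True, "node2": False, "node3": True}
chosen = connect(healthy, lambda n: healthy[n])
print(chosen)  # node1 or node3, never node2
```

Because the failed node is simply skipped, the caller gets a working connection without knowing that a node is down, which is the sense in which the switch is transparent.
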
The RAC database also has some drawbacks:

(1) Management and maintenance are more complex and demand more of maintenance staff than a standalone database.

(2) If the underlying planning and design are poor, overall system performance will be poor, possibly even worse than that of a single-node system. If you are not familiar with RAC, using it straight away in a production environment is therefore not recommended.

(3) Because a RAC cluster needs multiple nodes, multiple servers must be purchased and Oracle Database Enterprise Edition is required, which increases hardware and software costs.

3.3. RAC process management

A RAC database is made up of multiple nodes, each running a database instance with its own background processes and memory structures. Within a RAC cluster, the background processes and memory structures of every instance are the same, so the cluster as a whole looks like a single database image. Structurally, however, a RAC database differs from a single-instance database in the following ways:

(1) Each instance of a RAC database has at least one additional redo thread.

(2) Each instance of a RAC database has its own undo tablespace.

With this mechanism, each instance uses its own redo thread and undo tablespace independently and locks the data it modifies separately. This design keeps the operations of the instances relatively independent of one another. How, then, does a RAC database keep data consistent across nodes? Each RAC instance has a buffer cache in its SGA, and through Cache Fusion technology RAC synchronizes the cached information in the SGAs of the nodes. This ensures data consistency across nodes and also improves the access speed of the cluster.
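
The idea behind Cache Fusion can be sketched as a toy Python model. It is heavily simplified and is not Oracle's implementation: the Cluster and Instance classes, the copies table, and the rule that a write drops every other cached copy are assumptions made for illustration. What it shows is that a block already cached by one instance is handed to another instance from memory instead of being read from shared disk again.

```python
class Cluster:
    """Toy shared state: disk contents plus a record of who caches each block."""

    def __init__(self, disk):
        self.disk = dict(disk)   # block id -> contents on shared storage
        self.copies = {}         # block id -> instances holding a cached copy
        self.disk_reads = 0


class Instance:
    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster
        self.cache = {}          # this instance's buffer cache

    def read(self, block):
        if block not in self.cache:
            holders = self.cluster.copies.setdefault(block, [])
            if holders:
                # Another instance has the block cached: copy it from there.
                self.cache[block] = holders[0].cache[block]
            else:
                # Nobody has it cached: read it from shared storage.
                self.cache[block] = self.cluster.disk[block]
                self.cluster.disk_reads += 1
            holders.append(self)
        return self.cache[block]

    def write(self, block, value):
        self.read(block)
        # Simplification: drop every other cached copy so no stale data remains.
        for other in self.cluster.copies[block]:
            if other is not self:
                other.cache.pop(block, None)
        self.cluster.copies[block] = [self]
        self.cache[block] = value


cluster = Cluster({"blk1": "v0"})
a, b = Instance("A", cluster), Instance("B", cluster)
a.read("blk1")             # first access goes to disk
a.write("blk1", "v1")      # changed in A's cache only, not yet on disk
print(b.read("blk1"))      # v1, served from A's cache
print(cluster.disk_reads)  # 1
```

In this model instance B sees A's latest change even though the disk still holds the old value, which is the consistency property the paragraph above describes.
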
The most important feature of a RAC database is sharing. Orderly data sharing among multiple nodes is achieved by two RAC services: the Global Cache Service (GCS) and the Global Enqueue Service (GES).

GCS and GES are the most basic RAC services and are used mainly to coordinate simultaneous access to the shared database and to shared resources within the database. GES and GCS record and maintain state information for each data file in the Global Resource Directory (GRD). The GRD is kept in memory, and its contents are distributed across all instances, with each instance managing part of it.

Several special processes, together with the GRD, make Cache Fusion possible in RAC. They are:

Global Cache Service Processes (LMSn)
The LMS processes mainly manage access to data blocks within the cluster and transmit block images between the buffer caches of different instances.

Global Enqueue Service Monitor (LMON)
LMON mainly monitors global resources and resource interactions across the cluster, and manages instance and process exceptions and the associated recovery of cluster enqueues.

Global Enqueue Service Daemon (LMD)
The LMD process mainly manages access to global enqueues and global resources, updates the status of the corresponding enqueues, and processes resource requests from other instances.

Lock Processes (LCK)
The LCK process mainly manages inter-instance resource requests and cross-instance call operations. It handles resource requests other than Cache Fusion, such as library cache and row cache requests.

Diagnosability Daemon (DIAG)
The DIAG process mainly captures diagnostic information about failed processes in the instance and generates the corresponding trace files.

3.4. RAC database storage planning

Installing a RAC database involves the Oracle Clusterware software and the Oracle RAC database software, as well as the voting disk and OCR. The amount of space required for each part is as follows:

[Table: disk space required for each part]

Once you know how much disk space each part of the RAC needs, you can plan data storage according to the purpose of each part. RAC supports a wide variety of storage methods, such as the standalone file systems ext2/ext3, the cluster file systems OCFS2/GFS, the network file system NFS, raw devices, and Automatic Storage Management (ASM). The following table lists the storage types you can use:

[Table: storage types that can be used]

Which storage strategy to use depends on the RAC environment being installed. Three commonly used storage methods are recommended here:
