Oracle Cluster Concepts and Principles (ii)
Overview: This document records the original intent and motivation carried over from the previous article, the Oracle Basic Operations Manual. That manual summarized the author's vacation study of Oracle fundamentals, partly to form a systematic summary for later review and partly for ease of use, and this document grew out of the same text. Before moving on to the Oracle RAC installation and usage tutorials, this article first lays out the overall ideas and structure of the series. Because readers' background knowledge varies, it begins with the preparation and planning that precede an Oracle RAC deployment. The cluster configuration and installation began under the guidance of Dr. Tang and took two to three months of exploration; many problems were encountered along the way, and they are documented here as well. This article is original/curated; when reproducing it, please credit the original source: Oracle Cluster Concepts and Principles (ii).
Bai Ningsu July 16, 2015 16:45:02
Oracle's Three Highly Available Cluster Scenarios
1. RAC (Real Application Clusters)
Multiple Oracle servers form a shared cache and share network-based storage. The system can tolerate the failure of one or more machines. However, the nodes require high-speed network interconnection, which in practice means putting everything in one room or data center. If that room fails, for example because the network goes down, the whole cluster is unavailable. So RAC alone cannot meet the needs of a typical Internet company's critical business; critical business requires multiple data centers to tolerate the loss of a single one.
2. Data Guard (its main function is redundancy)
The Data Guard scheme is suited to multi-data-center deployments: a production database runs in one data center, and standby databases are deployed in the others. Standby databases come in two kinds, physical and logical. A physical standby is used mainly for failover after a production outage, while a logical standby can share the production database's read load during normal operation.
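As a minimal sketch of telling the two roles apart in practice (assuming the python-oracledb driver; the user, password, and DSN below are placeholders, not values from this article), the role can be read from v$database:

```python
# Sketch: distinguish a primary from a physical/logical standby.
# Assumes python-oracledb; credentials and DSN are placeholders.
import oracledb

conn = oracledb.connect(user="system", password="oracle",
                        dsn="dbhost/orcl")
with conn.cursor() as cur:
    cur.execute("SELECT database_role, open_mode FROM v$database")
    role, open_mode = cur.fetchone()
    # A physical standby reports PHYSICAL STANDBY (typically MOUNTED
    # or READ ONLY WITH APPLY); a logical standby reports
    # LOGICAL STANDBY and is open for reads.
    print(f"role={role}, open_mode={open_mode}")
conn.close()
```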
3. MAA
MAA (Maximum Availability Architecture) is not an independent third option but a combination of the previous two, providing the highest availability: a RAC cluster is deployed in each data center, and Data Guard synchronizes the data between the data centers.
RAC Overview
A shared storage file system (such as NFS), or even a cluster file system (such as OCFS2), is mainly used with storage area networks: all nodes directly access the storage through the shared file system, so the failure of one node does not affect access to the file system from the other nodes. Highly available clusters typically use a shared-disk file system.
The core of Oracle RAC is the shared disk subsystem. All nodes in the cluster must have access to all data files, redo log files, control files, and parameter files; the data disks must be globally available and all nodes allowed to access the database. Each node has its own redo logs and control files, but the other nodes must be able to access them so that the node can be recovered if it suffers a system failure.
Oracle RAC runs on top of a cluster, providing the highest levels of availability and scalability for Oracle databases at low computing cost. If one node in the cluster fails, Oracle continues to run on the remaining nodes. Oracle's main innovation is a technology called Cache Fusion (cache merging). Cache Fusion lets the nodes in a cluster synchronize their memory caches efficiently over the high-speed cluster interconnect, minimizing disk I/O. Its most important advantage is that it lets all nodes in the cluster share access to all data; the data does not need to be partitioned between nodes. Oracle is the only vendor to offer an open-systems database with this capability. Other database software that claims to run on clusters needs to partition the database data, which is impractical. The enterprise grid is the data center of the future, built on large configurations of standardized commodity components: processors, networks, and storage. Oracle RAC's Cache Fusion technology provides the highest levels of availability and scalability. Oracle Database 10g and Oracle RAC 10g significantly reduce operating costs and increase flexibility, giving systems greater adaptability, foresight, and agility. Dynamically provisioning nodes, storage, CPUs, and memory reduces costs by raising utilization while still meeting the required service levels.
RAC Integrated Cluster Component Management
Oracle RAC 10g provides a fully integrated cluster management solution on all platforms that run Oracle Database 10g. This clusterware includes cluster connectivity, messaging services and locking, cluster control and recovery, and a workload management framework (explored below). Oracle RAC 10g's integrated cluster component management offers the following benefits:
(i) Low cost. Oracle offers this feature for free.
(ii) Single-vendor support, which eliminates cross-vendor finger-pointing.
(iii) Easy installation, configuration, and ongoing maintenance. Oracle RAC 10g cluster components are installed, configured, and maintained with the standard Oracle database management tools; no additional integration steps are required.
(iv) Consistent quality across all platforms. Oracle tests new software releases more rigorously than third-party products are tested.
(v) Consistent functionality across all platforms. For example, some third-party cluster products limit the number of nodes a cluster can support; with Oracle RAC 10g, every platform supports up to 64 nodes. Users also get a consistent response experience on all platforms, effectively addressing high-availability challenges including server node failures, interconnect failures, and I/O fencing.
(vi) Support for advanced features. This includes integrated monitoring and notification capabilities that enable rapid, coordinated recovery between the database and application tiers when a failure occurs.
Architecture of RAC
RAC is a clustered solution for Oracle databases, capable of coordinating operations across two or more database nodes, as shown in the RAC structure diagram:
The Cluster Manager integrates the other modules of the clustered system and provides communication between cluster nodes over a high-speed interconnect. Nodes are connected by a heartbeat link; the information carried on the heartbeat determines the cluster's logical node membership, node updates, and each node's running state at a point in time, keeping the cluster system operating normally.

The communication layer manages communication between the nodes. It carries the configuration and interconnect information of the cluster nodes: the Cluster Manager generates information through the heartbeat mechanism, and the communication layer is responsible for transmitting it and ensuring it arrives correctly. There are also cluster monitoring processes that continually verify the health of different areas of the system; for example, heartbeat monitoring constantly verifies that the heartbeat mechanism is working well.

In an application environment, all servers use and manage the same database, distributing the workload across the servers. The hardware requires at least two servers plus a shared storage device, and two kinds of software: the cluster software and the RAC component of the Oracle database. The OS on all servers should be of the same type. Depending on the load-balancing policy, when a client sends a request to a service's listener, the server passes the request to its local RAC component for processing, or it may forward it to another server's RAC component. After processing the request, RAC accesses the shared storage device through the cluster software.

Logically, each node participating in the cluster runs a separate instance, and all instances access the same database. The nodes communicate with one another through the cluster software's communication layer. At the same time, to reduce I/O, there is a global cache service, so each database instance keeps a copy of the same database cache. The features of RAC are as follows (a query sketch follows the list):
- Each node's instance has its own SGA;
- Each node's instance has its own background processes;
- Each node's instance has its own redo logs;
- Each node's instance has its own undo tablespace;
- All nodes share one copy of the datafiles and control files.
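A minimal sketch of observing these per-instance structures from any one node, using the GV$ views that aggregate all RAC instances; the connection details below are placeholder assumptions, not values from this article:

```python
# Sketch: show per-instance structures in a RAC database.
# Assumes python-oracledb; credentials and DSN are placeholders.
import oracledb

conn = oracledb.connect(user="system", password="oracle",
                        dsn="rac-scan/orcl")
with conn.cursor() as cur:
    # One row per running instance: each has its own SGA and processes.
    cur.execute("SELECT inst_id, instance_name, host_name, status "
                "FROM gv$instance ORDER BY inst_id")
    for inst_id, name, host, status in cur:
        print(f"instance {inst_id}: {name}@{host} [{status}]")

    # Each instance owns one redo thread; all groups are visible to all.
    cur.execute("SELECT thread#, COUNT(*) FROM v$log GROUP BY thread#")
    for thread, groups in cur:
        print(f"redo thread {thread}: {groups} log groups")
conn.close()
```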
Structural Composition and Mechanism of RAC
Before Oracle 9i, RAC was called OPS (Oracle Parallel Server). A big difference between RAC and OPS is that RAC uses Cache Fusion technology: a data block updated by one node can be shipped to and updated by another node before the earlier version has been written to disk, and only the final version needs to be written. In OPS, a data request between nodes required the block to be written to disk before the requesting node could read it. With Cache Fusion, data buffers are transmitted between the RAC nodes over a high-speed, low-latency interconnect. The figure shows a typical RAC serving external requests; an Oracle RAC cluster contains the following parts:
- Cluster nodes: 2 to N nodes or hosts running Oracle Database Server.
- Private network (interconnect): RAC requires a high-speed private interconnect to carry inter-node communication and Cache Fusion traffic.
- Shared storage: RAC requires a shared storage device so that all nodes can access the data files.
- Production network: the network through which RAC serves the outside world; both clients and applications access the cluster over it.
RAC Background Processes
Oracle RAC has its own unique background processes, which play no role in a single-instance database. As shown in the figure, several background processes run on RAC; their functions are described below.
(1) LMS (Global Cache Service process) mainly manages access to data blocks within the cluster and transmits block images between the buffer caches of different instances. It copies a data block directly from the cache of the holding instance and sends a copy to the requesting instance, ensuring that an image of a given data block appears in each instance's buffer cache at most once. LMS coordinates block access by passing messages between instances: when an instance requests a data block, the requesting instance's LMD process sends a block-resource request to the LMD process of the instance mastering the block, which forwards it to the LMD process of the instance currently holding the resource. The holding instance's LMS process creates a consistent-read image of the block and passes it to the buffer cache of the requesting instance. LMS ensures that only one instance can update a block at a time and maintains the block's image record (including the status flag of the data block being updated). RAC provides up to 10 LMS processes (0-9), and their number grows as message-passing traffic between nodes increases.

(2) LMON (lock monitor process) is the global enqueue service monitor. The LMON processes of all instances communicate periodically to check the health of every node in the cluster. When a node fails, LMON is responsible for cluster reconfiguration, GRD recovery, and similar operations; the service it provides is called Cluster Group Services (CGS).
LMON mainly uses two kinds of heartbeat mechanism to perform the health check:
(a) Network heartbeat between nodes: the nodes periodically send ping packets to probe each other's status; if a response arrives within the specified time, the other node is assumed to be healthy.
(b) Disk heartbeat through the control file (control file heartbeat): each node's CKPT process updates a block of the control file every 3 seconds; this block is called the checkpoint progress record. Because the control file is shared, the instances can check one another's updates to determine whether a node is still alive.
(3) LMD (Global Enqueue Service Daemon, lock management daemon) is a background process, also known as the global queue service daemon; its resource-management duties cover access to blocks and global enqueues. Within each instance, the LMD process manages incoming remote resource requests (that is, lock requests from other instances in the cluster). It is also responsible for deadlock detection and monitoring lock-conversion timeouts.
(4) LCK (the lock process) manages non-Cache-Fusion resources, i.e. local resource requests. The LCK process manages lock requests for shared resources across instances and cross-instance call operations. During recovery it builds a list of invalid lock elements and validates the lock elements. Because LMS carries the primary lock-management function, only a single LCK process exists in each instance.
(5) DIAG (the diagnosability daemon) is responsible for capturing information about process failures in a RAC environment and writing out trace information for failure analysis. The information DIAG generates is useful when working with Oracle Support to find the cause of a failure. Only one DIAG process is needed per instance.
(6) GSD (the Global Service Daemon) interacts with RAC's management tools (DBCA, srvctl, and OEM) to carry out administrative tasks such as starting and stopping instances. To ensure these management tools function properly, GSD must be started on all nodes; one GSD process can support multiple RAC databases. The GSD process lives in a node's $ORACLE_HOME/bin directory, and its log file is $ORACLE_HOME/srvm/log/gsdaemon.log.

The GCS and GES processes maintain the state information of each data file and cache block through the Global Resource Directory (GRD). When one instance accesses data and caches it, the other instances in the cluster obtain a corresponding block image, so they can access the data without re-reading the disk, reading instead directly from the cache in the SGA. The GRD exists in the memory structures of every active instance, which is why an instance's SGA in a RAC environment is larger than the SGA of a single-instance database. Other processes and memory structures differ little from a single-instance database.
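As a small illustration (connection details are placeholders), the lock- and cache-related background processes described above can be listed from v$bgprocess, where a non-zero PADDR marks a process that is actually running:

```python
# Sketch: list the running RAC lock/cache background processes
# (LMON, LMD0, LMS0..., LCK0, DIAG) on the local instance.
# Assumes python-oracledb; credentials and DSN are placeholders.
import oracledb

conn = oracledb.connect(user="system", password="oracle",
                        dsn="rac-node1/orcl")
with conn.cursor() as cur:
    cur.execute("""
        SELECT name, description
          FROM v$bgprocess
         WHERE paddr != '00'
           AND (name LIKE 'LM%' OR name LIKE 'LCK%' OR name = 'DIAG')
         ORDER BY name""")
    for name, description in cur:
        print(f"{name:6s} {description}")
conn.close()
```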
RAC Shared Storage
RAC requires shared storage. Beyond instance-external information such as the OCR and voting disk mentioned earlier, the data files are also kept on this shared storage. The storage options include OCFS, OCFS2, RAW, NFS, ASM, and so on.

OCFS (Oracle Cluster File System) and OCFS2 are simply file systems that, like NFS, provide a shared storage file system in a clustered environment. Raw devices are another storage option, supported by RAC in versions before Oracle 11g; before Oracle 9i, OPS/RAC could only work this way, mapping the shared storage to raw devices and placing the data Oracle needs on them. But raw devices are not as intuitive as a file system, are hard to manage, and carry many limitations, so a replacement was clearly needed; hence file systems such as OCFS. That is only Oracle's own cluster file system; other vendors also offer file systems as storage options.

ASM is only a database storage scheme, not a cluster scheme, so ASM is not a concept on the same level as raw devices and OCFS/OCFS2: raw devices and OCFS/OCFS2 can serve not only as database storage but also as storage for Clusterware, i.e. the storage CRS requires, while ASM stores only the database and is, strictly speaking, a node application (nodeapps) within RAC. ASM is not supported for the OCR and voting disk that the Clusterware installation requires; after all, ASM itself requires an instance, and CRS sits entirely outside that architecture. This is one reason why deployments that use ASM still add OCFS/OCFS2 or raw devices. The RAC shared storage options compare as follows:
- Cluster file system: OCFS/OCFS2 is supported on Windows and Linux, and GPFS on AIX. Its advantage is that it is easy to manage and very intuitive; its disadvantage is that it relies on file-system management software and goes through the OS cache, so performance and stability fall short, making it unsuitable for production environments. It can hold both CRS cluster software files and database files.
- Raw device mode: uses raw devices directly on a hardware-supported shared storage system; it can hold both cluster software files and database files.
- Network File System (NFS): shared storage over NFS, though an Oracle-certified NFS is required; it can hold CRS cluster software files and database files.
- ASM: combines the high I/O performance of raw devices with the manageability of a cluster file system, and was introduced in Oracle 10g for shared storage. However, ASM itself needs the support of an Oracle instance, so ASM supports only database files, not CRS files. (A quick look at ASM disk groups follows this list.)
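As a quick look at ASM from the database side, this sketch lists the ASM disk groups a database instance can see; the connection details are placeholder assumptions:

```python
# Sketch: list ASM disk groups visible to a database instance whose
# files live in ASM. Assumes python-oracledb; credentials and DSN
# are placeholders.
import oracledb

conn = oracledb.connect(user="system", password="oracle",
                        dsn="rac-scan/orcl")
with conn.cursor() as cur:
    cur.execute("SELECT name, state, type, total_mb, free_mb "
                "FROM v$asm_diskgroup")
    for name, state, gtype, total_mb, free_mb in cur:
        print(f"{name}: {state} {gtype} {free_mb}/{total_mb} MB free")
conn.close()
```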
The difference between a RAC database and a single-instance database
So that all instances in RAC can access the database, all datafiles, control files, PFILEs/SPFILEs, and redo log files must be stored on a shared disk that all nodes can access concurrently; this involves raw devices or a cluster file system. Structurally, a RAC database differs from a single instance in that at least one redo thread is configured per instance: for example, a cluster of two instances needs at least 4 redo log groups, two per instance. In addition, an undo tablespace is prepared for each instance, as the sketch below illustrates.
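A minimal sketch of that arithmetic for a second instance: one more redo thread with two groups, plus its own undo tablespace. The group numbers, sizes, and the '+DATA' disk group are placeholder assumptions, not values from this article:

```python
# Sketch: add the redo thread and undo tablespace for a second RAC
# instance. Group numbers, sizes, and '+DATA' are placeholders;
# requires a SYSDBA connection.
import oracledb

conn = oracledb.connect(user="sys", password="oracle",
                        dsn="rac-node1/orcl",
                        mode=oracledb.AUTH_MODE_SYSDBA)
with conn.cursor() as cur:
    # Two instances need at least 4 redo groups: 2 per thread.
    cur.execute("ALTER DATABASE ADD LOGFILE THREAD 2 "
                "GROUP 3 ('+DATA') SIZE 50M")
    cur.execute("ALTER DATABASE ADD LOGFILE THREAD 2 "
                "GROUP 4 ('+DATA') SIZE 50M")
    cur.execute("ALTER DATABASE ENABLE PUBLIC THREAD 2")
    # Each instance also gets its own undo tablespace.
    cur.execute("CREATE UNDO TABLESPACE undotbs2 "
                "DATAFILE '+DATA' SIZE 200M AUTOEXTEND ON")
conn.close()
```

On a real system these statements are more often run from SQL*Plus or handled by DBCA, but the SQL is the same.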
1. Redo and undo: when an instance modifies data, it uses its own redo thread and undo segments, and each instance locks the data it modifies itself, so operations on different instances remain independent and the data stays consistent. The redo logs and archive logs in this arrangement need special consideration during backup and recovery.
2. Memory and processes: each node's instance has its own memory structures and process structures, and these structures are basically the same from node to node. Through Cache Fusion, RAC synchronizes the cached information in the SGAs across nodes, improving access speed while ensuring consistency.
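To make that Cache Fusion synchronization visible, a hedged sketch: the global cache ("gc") block-transfer statistics can be read from every instance at once through gv$sysstat (connection details are placeholders):

```python
# Sketch: gauge Cache Fusion traffic via the global cache block
# transfer statistics of all instances. Assumes python-oracledb;
# credentials and DSN are placeholders.
import oracledb

conn = oracledb.connect(user="system", password="oracle",
                        dsn="rac-scan/orcl")
with conn.cursor() as cur:
    cur.execute("""
        SELECT inst_id, name, value
          FROM gv$sysstat
         WHERE name IN ('gc cr blocks received',
                        'gc current blocks received')
         ORDER BY inst_id, name""")
    for inst_id, name, value in cur:
        print(f"inst {inst_id}: {name} = {value}")
conn.close()
```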
Reference documents
- Oracle Cluster Concepts and Principles: Oracle's Three Highly Available Cluster Scenarios
- Oracle One-off RAC Survival Guide
- Oracle 11gR2 RAC Management and Performance Optimization
- Oracle_base (recommended)
- Oracle RAC
Article Navigation
- Introduction to Cluster Concepts (i)
- Oracle Cluster Concepts and Principles (ii)
- How RAC Works and Related Components (iii)
- Cache Fusion Technology (iv)
- RAC Special Problems and Practical Experience (v)
- Oracle 11g Release 2 RAC with NFS on Linux (vi)
- Database 11g RAC Cluster Installation under Oracle Enterprise Linux 5.7 (vii)
- Database 11g RAC Database Installation under Oracle Enterprise Linux 5.7 (viii)
- Basic Testing and Use of Database 11g RAC under Oracle Enterprise Linux 5.7 (ix)
Note: This article is original/curated; please credit the original source when reproducing it.
Oracle Cluster "Oracle DATABASE 11G RAC Knowledge graphic Detailed tutorial" Oracle Cluster Concepts and principles (ii)