Original link: http://tech.it168.com/a2012/0814/1384/000001384756_all.shtml
Different cluster products have their own characteristics. The main features of RAC are the following:
• Dual-machine parallel. RAC is a parallel mode, not a traditional master and standby mode. That is, all members of the RAC cluster can receive client requests at the same time.
• High availability. RAC is the high-availability solution for Oracle database products. As long as at least one node in the cluster survives, the cluster can continue to provide service.
• Ease of scalability. RAC makes it easy to add and remove nodes as the needs of the system change.
• Low cost. A high-availability, high-throughput cluster built from lower-cost servers costs much less than reaching the same availability and throughput by adding hardware to a high-end server.
• High throughput. The throughput of the whole RAC cluster grows as the number of nodes increases.
These five features are discussed in detail below.
First, dual-machine parallel
RAC is a high-availability implementation that makes full use of server resources, and its parallel mode differs considerably from the traditional dual-machine hot-standby approach. The two are compared below.
As Figure 1-4 shows, in a traditional dual-machine hot-standby environment one of the two nodes always serves as the standby and takes over only when the primary node fails. If the primary never fails, the standby stays idle, which wastes both resources and money. RAC, by contrast, is a parallel architecture: the two nodes of the cluster run in parallel, and when one machine fails, requests are automatically redirected to the other. No machine sits unused as a standby, so server resources are fully utilized. In addition, a traditional hot-standby architecture often needs several minutes to switch over when a problem occurs, whereas in RAC existing sessions complete failover in a few tens of seconds and the creation of new sessions is not affected, so RAC also has a considerable advantage in switchover time.
Second, high availability
RAC is Oracle's high-availability solution for its database. High availability has two parts. The first is ensuring that no data is lost; this is the most basic requirement and must be guaranteed. The second is keeping the Oracle database running normally so that customers do not suffer losses from downtime; this is the part discussed most often.
Downtime generally falls into two categories: planned and unplanned. Planned downtime is a scheduled outage of a node or system, typically for an Oracle upgrade, system maintenance, or hardware maintenance. Unplanned downtime is a sudden outage that nobody scheduled, typically caused by an Oracle bug, a system failure, a hardware failure, or operator error.
Without very high cost, it is almost impossible to keep a system running 100% of the time. The following table lists the downtime that corresponds to specific availability ratios, giving the maximum downtime per year, month, and week for each ratio.
Typically, the availability ratio is calculated from monthly downtime. A reasonable availability ratio should be set for each system according to its importance.
The greatest advantage of clustering is high availability: to some extent it avoids data loss and unplanned downtime caused by hardware or software failures, and it can also reduce or eliminate planned downtime. This is the most direct reason many customers choose RAC.
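The relationship between an availability ratio and its maximum downtime is simple arithmetic. The following is a minimal sketch, assuming a 365-day year, a 30-day month, and a 7-day week; the function name and the sample ratios are chosen here for illustration and do not come from the article.

```python
# Maximum downtime permitted by a given availability ratio.
# Assumptions (illustrative): 365-day year, 30-day month, 7-day week.
PERIOD_HOURS = {"year": 365 * 24, "month": 30 * 24, "week": 7 * 24}


def max_downtime_minutes(availability_percent, period):
    """Return the maximum downtime, in minutes, allowed in one period."""
    unavailable_fraction = 1 - availability_percent / 100
    return PERIOD_HOURS[period] * 60 * unavailable_fraction


if __name__ == "__main__":
    # Sample ratios only; pick the ratio that matches the system's importance.
    for pct in (99.0, 99.9, 99.99, 99.999):
        row = ", ".join(
            f"{period}: {max_downtime_minutes(pct, period):.1f} min"
            for period in PERIOD_HOURS
        )
        print(f"{pct}% -> {row}")
```

Under these assumptions, 99.9% availability allows about 43.2 minutes of downtime in a 30-day month.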
RAC includes many high-availability features, mainly the following:
• Load balancing between nodes.
• Failover between nodes.
• Control of client access paths through the service component.
• Automated resource management by the cluster software, which checks node state periodically and automatically restarts failed processes and nodes that fail heartbeat detection, restoring them to normal operation.
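The first two points can be illustrated with a toy model. The sketch below is not Oracle's implementation; the `Node` class, the `alive` flag, and the `connect` function are hypothetical names used only to show the idea of spreading new sessions across nodes and skipping a failed node at connect time.

```python
import random


class Node:
    """A hypothetical cluster node; `alive` stands in for a real health check."""

    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive


def connect(nodes, rng=random):
    """Pick a node at random (load balancing) and skip dead ones (failover).

    Returns the name of the node that accepted the connection.
    Raises ConnectionError when no node is alive.
    """
    candidates = list(nodes)
    rng.shuffle(candidates)  # spread new sessions across the nodes
    for node in candidates:
        if node.alive:  # a failed node is skipped, the next one is tried
            return node.name
    raise ConnectionError("no surviving node in the cluster")


if __name__ == "__main__":
    cluster = [Node("node1"), Node("node2")]
    print(connect(cluster))   # either node1 or node2
    cluster[0].alive = False  # simulate a failure of node1
    print(connect(cluster))   # node2, the only surviving node
```

In this model the cluster keeps accepting connections as long as one node is alive, which is the behaviour the feature list describes.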
In the Oracle 11gR2 release, Clusterware has been improved to provide higher availability. For example, a large number of new agent-based monitors are used to watch all resources. These agents use fewer resources while performing more frequent checks, which means faster failure detection and shorter recovery times. For the Oracle listener, for example, the average failure detection time dropped from 5 minutes to 30 seconds as the check interval dropped from every 10 minutes to every 1 minute. In addition, Clusterware's "Out-of-place Upgrade" feature reduces the downtime required for software maintenance.
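The listener figures are consistent with a simple model of periodic checking: if a failure happens at a uniformly random moment between two checks, it waits on average half an interval before the next check finds it. The sketch below encodes only that model; it assumes instantaneous checks and is not a description of Clusterware's internals, and the function name is hypothetical.

```python
def average_detection_delay(check_interval_seconds):
    """Mean delay between a failure and the next periodic check.

    Assumes the failure time is uniformly distributed within a check
    interval and that the check itself takes no time.
    """
    return check_interval_seconds / 2


if __name__ == "__main__":
    for label, interval in (("10-minute checks", 600), ("1-minute checks", 60)):
        delay = average_detection_delay(interval)
        print(f"{label}: {delay:.0f} s average detection delay")
```

With a 600-second interval the model gives 300 seconds (5 minutes), and with a 60-second interval it gives 30 seconds.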
Third, easy scalability
RAC provides scalability for applications whose capacity has to be re-planned. To keep costs low in the initial stage of a system and avoid unnecessary waste, the database environment can be built on a standard hardware configuration with appropriately sized server and storage resources. When the system later needs more processing power or more storage, adding another server or storage device to the cluster expands it horizontally without downtime. Within one cluster, Clusterware and RAC support up to 100 nodes.
When one cluster has excess processing power and another has too little, a node can be moved from the cluster with excess capacity to the one that lacks it. This makes full use of server resources and saves cost. Grid Plug and Play (GPnP) was introduced in the 11gR2 release to make adding nodes quick.
Fourth, low cost
A cluster of ordinary PC servers can provide the required processing power at a much lower cost than a single high-performance server. When more processing power is needed, adding a node to the cluster is also much easier than adding hardware to a high-performance server. In addition, nodes can be removed from a cluster dynamically, so all server resources can be managed and used more fully, which lowers server procurement cost overall. More and more enterprises are willing to adopt clustering solutions to reduce costs and improve system availability.
Fifth, high throughput
RAC is a single logical unit made up of multiple servers, so it can accept more client requests than a single database server. The difference is most evident in systems that require high throughput. In a RAC architecture, multiple instances are distributed across multiple servers and open the same database at the same time. Each instance can take a share of the client requests, so throughput increases as servers are added.
Of the features discussed above, high availability is the most important feature of RAC.
Problems with RAC
Although RAC has a number of advantages, deploying it involves a wide range of technologies, such as servers, storage devices, HBA cards, and operating systems, and the implementation is more complex than for a single-instance database. The stability of the hardware, the compatibility between devices and between devices and the operating system, and Oracle bugs can all cause problems in a running RAC. In practice, therefore, RAC shows more problems than a single-instance database, and their causes vary. The problems with RAC lie mainly in stability and performance, and these two issues are discussed below.
First, stability
Stable operation of the database is the foundation and precondition for stable operation of the whole system, and the database in turn depends on the operating system, the servers, the storage devices, and so on.
Because hardware devices and operating systems come from different vendors, compatibility problems sometimes arise. Even with servers from the same vendor, differences in driver or firmware versions can cause hardware problems and compatibility issues with other devices. In addition, RAC itself has many bugs, and many deployed RAC environments are not adequately checked and tested before going live. The result is a series of instabilities at run time, so that high availability is not fully realized.
For this reason, a stable hardware environment together with a stable RAC version determines how stably RAC runs. Database engineers and hardware engineers should thoroughly check and validate the environment before installation and configuration, and test extensively after deployment.
Second, high performance
Performance is also a major headache when migrating from a single-machine environment to RAC, because RAC is not a high-performance solution in itself. With today's gigabit network hardware, system performance often decreases after a database is migrated from a single machine to a RAC environment. In this case, the database administrator should make reasonable recommendations based on the characteristics of RAC. With sound design and development, RAC can improve the processing performance of the system.
These two issues require special attention. In addition, good communication with hardware engineers and system developers, together with sound system design, is the precondition for the stable operation and high performance of RAC.
Oracle RAC Basic Concepts