Understanding the RAC Cache Fusion Principle


Cache Fusion, GRD, DRM, GCS, GES


Cache Fusion
1. RAC is one database running on multiple instances. Concurrency between the instances is resolved through the DLM (Distributed Lock Manager), which coordinates the resources shared by the RAC nodes so that every node sees consistent data. The DLM therefore arbitrates resource contention among the instances; in RAC this DLM is called Cache Fusion.


2. In Cache Fusion, every data block is mapped to a Cache Fusion resource. The resource is really a data structure whose name is derived from the block address. The block request process is: first convert the block address into a Cache Fusion resource name, then submit a request for that resource to the DLM, which carries out the global lock acquisition and release. Only after the process has obtained the PCM lock can it continue to the next step; in other words, the instance must first obtain the right to use the data block.


(Convert the block address into a Cache Fusion resource name ----> submit the request for that resource to the DLM ----> obtain access to the resource)
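To make that flow concrete, here is a minimal Python sketch of the three steps, assuming a block address given as (file#, block#). Everything in it (the resource-name format, the hash-based choice of a master instance, and the names resource_name_for, master_of and request_block) is invented for illustration; it is not Oracle's actual implementation.

```python
# Toy model of the Cache Fusion request flow described above.
# This is NOT Oracle code; all names and structures are invented for illustration.

NODES = ["A", "B", "C", "D"]          # instances in the cluster

def resource_name_for(file_no: int, block_no: int) -> str:
    """Step 1: convert the data block address into a resource name."""
    return f"[0x{block_no:x}][0x{file_no:x}],[BL]"

def master_of(resource: str) -> str:
    """Step 2: the resource name determines which instance masters it (here a simple hash)."""
    return NODES[hash(resource) % len(NODES)]

def request_block(requestor: str, file_no: int, block_no: int, mode: str) -> dict:
    """Step 3: submit the resource request to the DLM/GCS and receive the grant."""
    resource = resource_name_for(file_no, block_no)
    master = master_of(resource)
    # in this toy model the master always grants the requested mode
    return {"resource": resource, "master": master,
            "holder": requestor, "granted_mode": mode}

print(request_block("C", file_no=4, block_no=130, mode="S"))
```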


GRD
1. GRD (Global Resource Directory) can be seen as an internal database that records how each block is distributed among the instances of the cluster. Each instance holds part of the GRD in its SGA, and together these parts form the complete GRD. For a given resource, the usage information from all nodes is recorded on that resource's master node, while each node also records the usage information for the resources it holds itself.
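As a rough mental model, one entry of this internal database can be pictured as below. This is only an illustration with invented field names, not Oracle's internal format; the mode and role values are explained later in the article.

```python
# Illustrative model of what a GRD entry records for one block resource.
# Field names are invented; Oracle's real directory structures differ.
from dataclasses import dataclass, field

@dataclass
class GrdEntry:
    resource: str                               # resource name derived from the block address
    master: str                                 # instance that masters this resource
    mode: str = "NULL"                          # NULL / S / X
    role: str = "LOCAL"                         # LOCAL / GLOBAL
    holders: dict = field(default_factory=dict) # instance -> mode it currently holds

# The master node records how the resource is used on all nodes;
# every node also knows what it holds locally.
grd_on_master_D = {
    "[0x82][0x4],[BL]": GrdEntry(resource="[0x82][0x4],[BL]", master="D",
                                 mode="S", role="LOCAL", holders={"C": "S"}),
}
print(grd_on_master_D["[0x82][0x4],[BL]"])
```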


DRM
1. DRM (Dynamic Resource Management): when a resource mastered on one node is accessed frequently from another node, DRM can remaster the resource, making the frequently accessing node its new master.
2. Problems caused by the use of DRM:
Experiment.....
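The exact conditions under which Oracle triggers a remaster are internal and not described here; the sketch below only illustrates the idea, using an invented per-node access counter and an invented 10x threshold.

```python
# Toy illustration of dynamic remastering: if one instance accesses a resource far
# more often than its current master does, move the mastership to that instance.
# The counters and the threshold are invented; Oracle's real DRM criteria differ.

access_counts = {"A": 12, "B": 3, "C": 950, "D": 40}   # accesses per node for one resource
current_master = "D"
REMASTER_RATIO = 10   # hypothetical: remaster when another node is 10x busier than the master

top_node = max(access_counts, key=access_counts.get)
if top_node != current_master and \
        access_counts[top_node] >= REMASTER_RATIO * access_counts[current_master]:
    print(f"remaster the resource from {current_master} to {top_node}")
    current_master = top_node
```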


GCS
1. GCS (Global Cache Service): best understood together with Cache Fusion.

The global cache deals with data blocks.

The Global Cache Service is responsible for maintaining cache coherency in the global buffer cache. It ensures that an instance must obtain the global lock resource before it changes a block at any point in time, so that another instance cannot change the same block at the same time. The instance making the change holds the current version of the block (including committed and uncommitted transactions) as well as the block's past image (PI).

If another instance then requests the block, the Global Cache Service is responsible for tracking which instance holds the block, which version of the block it holds, and in what mode the block is held. The LMS process is a key component of the Global Cache Service.
2. LMSn (Lock Manager Server) processes are responsible for transferring data blocks between instances. Their number is controlled by the parameter gcs_server_processes, with a default value of 2 and a valid range of 0-20.


GES
1. GES (Global Enqueue Service): primarily responsible for maintaining consistency of the dictionary cache and the library cache. The dictionary cache keeps data dictionary information in each instance's SGA for fast access. Because this information lives in memory, a dictionary change on one node, such as a DDL, must be propagated immediately to the dictionary caches of all other nodes; GES handles this and eliminates the differences between instances. For the same reason, the library cache locks taken on database objects while parsing SQL statements that affect them must be maintained across the instances, and GES must ensure that requests from multiple instances for the same object do not deadlock. The LMON, LCK, and LMD processes work together to implement the Global Enqueue Service. Apart from the maintenance and management of the data blocks themselves (which is done by GCS), GES is the important service that coordinates all other resources between the nodes of a RAC environment (a small sketch of this cross-instance invalidation follows this list).
2. LMON
The LMON processes of the instances communicate with each other periodically to check the health of every node in the cluster. When a node runs into trouble, LMON is responsible for cluster reconfiguration and GRD recovery; the service it provides is called Cluster Group Services (CGS).


LMON mainly uses two heartbeat mechanisms to perform the health check (a small model is sketched below):
1 - Network heartbeat between the nodes: think of a node periodically sending ping packets to check the state of the other nodes; if a reply arrives within the specified time, the other node is considered healthy.
2 - Disk heartbeat through the control file (control file heartbeat): every 3 seconds each node's CKPT process updates one block of the control file, called the checkpoint progress record. Because the control file is shared, the instances can check each other's records and infer whether the other node is still updating it in time.
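A simplified model of the two checks is sketched below. The 3-second CKPT interval comes from the text above; the ping timeout and the number of missed intervals tolerated are invented for the example.

```python
# Simplified model of LMON's two health checks; the values marked as hypothetical
# are invented for illustration.

NETWORK_TIMEOUT = 30   # hypothetical: seconds to wait for a ping reply
CKPT_INTERVAL = 3      # the control-file heartbeat is written every 3 seconds
CKPT_GRACE = 5         # hypothetical: missed intervals tolerated before suspecting the node

def network_heartbeat_ok(last_ping_reply_age: float) -> bool:
    """The node is considered alive if it answered a ping within the timeout."""
    return last_ping_reply_age <= NETWORK_TIMEOUT

def controlfile_heartbeat_ok(last_ckpt_update_age: float) -> bool:
    """Each node's CKPT process stamps the shared control file (checkpoint progress
    record) every 3 seconds; a stale stamp suggests the node is in trouble."""
    return last_ckpt_update_age <= CKPT_INTERVAL * CKPT_GRACE

def node_healthy(last_ping_reply_age: float, last_ckpt_update_age: float) -> bool:
    return network_heartbeat_ok(last_ping_reply_age) and \
           controlfile_heartbeat_ok(last_ckpt_update_age)

print(node_healthy(last_ping_reply_age=2.0, last_ckpt_update_age=4.0))    # True
print(node_healthy(last_ping_reply_age=2.0, last_ckpt_update_age=60.0))   # False
```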


3.LCK
This process is responsible for synchronizing access to non-Cache-Fusion resources. Each instance has one LCK process.
4.LMD
This process services the Global Enqueue Service (GES). Specifically, it coordinates the order in which the instances access globally shared resources, ensuring consistent data access. Together with the GCS service provided by the LMSn processes and with the GRD, it forms the core of the RAC feature Cache Fusion.
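To illustrate the non-block side that GES looks after, here is a small sketch of the cross-instance invalidation mentioned under GES: a DDL on one node makes the dictionary cache entry and the cached cursors for that object stale on every other node. The cluster dictionary, the object names and the function ddl_on are all invented for the example; real GES messaging works through global enqueues, not through a Python dictionary.

```python
# Toy illustration of the GES idea for non-block resources: after a DDL changes an
# object on one node, the dictionary cache entry and the library cache cursors that
# reference the object must be refreshed or invalidated on every other node.
# All names and structures here are invented for the example.

CLUSTER = {
    "A": {"dictionary_cache": {"SCOTT.EMP": "v1"},
          "library_cache": {"SELECT * FROM SCOTT.EMP"}},
    "B": {"dictionary_cache": {"SCOTT.EMP": "v1"},
          "library_cache": {"SELECT * FROM SCOTT.EMP"}},
}

def ddl_on(node: str, obj: str, new_version: str) -> None:
    """The node performing the DDL updates its own caches; the other instances drop
    their stale dictionary entry and any cached cursors that mention the object."""
    for name, inst in CLUSTER.items():
        if name == node:
            inst["dictionary_cache"][obj] = new_version
        else:
            inst["dictionary_cache"].pop(obj, None)   # force a re-read of the dictionary
        inst["library_cache"] = {sql for sql in inst["library_cache"] if obj not in sql}

ddl_on("A", "SCOTT.EMP", "v2")   # e.g. an ALTER TABLE on node A
print(CLUSTER["B"])              # B's dictionary entry and cached cursor are gone
```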




The Global Resource Directory is maintained by the Global Cache Service.
It records the mode of each resource, the role of each resource, and the state of the block in each instance. The master of a resource is known to every active node, and the mastership is redistributed when necessary (for example when an instance starts up or shuts down).


Global Cache Service
1. Resource mode, three kinds:
Null (the default; no access rights)
Shared (S) (the block can be queried)
Exclusive (X) (the holder can change the contents of the block; the other instances hold it in Null mode)
2. Resource role, two kinds:
Local: the role when the resource is requested for the first time; only one instance can have a dirty copy of this block
Global: when the block becomes dirty in more than one instance, the role changes from local to global; a global block can be written to disk only under the coordination of the Global Cache Service
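The two short lists above can be condensed into a tiny model. The enum names and the compatibility rule below are a conceptual sketch, not Oracle's internal representation.

```python
# Minimal model of the three resource modes and two roles listed above.
# Conceptual sketch only; not Oracle's internal representation.
from enum import Enum

class Mode(Enum):
    NULL = 0        # default: no access rights on the block
    SHARED = 1      # S: the block may be read / queried
    EXCLUSIVE = 2   # X: the block may be changed; other instances fall back to NULL

class Role(Enum):
    LOCAL = 0       # only one instance can hold a dirty copy of the block
    GLOBAL = 1      # the block is dirty in several instances; only the GCS coordinates its write to disk

def compatible(requested: Mode, held_elsewhere: Mode) -> bool:
    """Two instances may hold the block at the same time only if neither needs exclusivity."""
    return Mode.EXCLUSIVE not in (requested, held_elsewhere)

print(compatible(Mode.SHARED, Mode.SHARED))      # True: many readers are fine
print(compatible(Mode.EXCLUSIVE, Mode.SHARED))   # False: the reader must drop to NULL first
```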


Transfer of Cache Fusion Blocks
For example, take four nodes A, B, C and D; GCS = Global Cache Service. (A condensed trace covering the four cases below follows the walkthrough.)
1.Read with no transfer
Assume node C needs to read a block from the shared disk file. It sends a request to the Global Cache Service, and the request is directed to node D, which is the master of this block (every resource has a master). GCS grants the resource to C in shared mode and local role, records this state in the directory (held on node D), and then notifies C, which changes its lock on the resource from Null to Shared.

C then starts the I/O; C now holds the block, read from the disk file, in shared mode.




2.Read to write transfer
B also wants this block, and not just to read it: it wants to change its contents.

B sends a request to the GCS on D (the master of this block), and the GCS passes the request on to C, asking C to ship the block to B. C sends the block to B; after receiving it, B informs the GCS, and B can now change the block (C's lock on it drops to Null).

3.Write to Write transfer
A makes a request to node D's GCS. The GCS tells node B to give up its exclusive lock and send the current image of the block to A; if the request cannot be completed immediately, it is placed in the GCS queue. Before shipping the block to A, B writes the redo for its change, forcing a log flush, and then turns its lock mode to Null.

B sends the block to A and tells it that the exclusive resource is now available. A receives the image of the block and notifies the GCS that its state on the block is now exclusive. From this point on B can no longer operate on this block, even though its buffer cache still holds a copy of it (a past image).



4.Write to read transfer
C wants to read this block, so it first makes a request to D (the master). The GCS asks A to ship the block to C, and A completes its part of the work when it accepts the request; this may involve writing redo and flushing the log before the block is sent. A then downgrades its exclusive lock to shared mode. C takes the SCN of the block received from A, builds the resource information from it, and the GCS updates the Global Resource Directory accordingly.
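The four cases above can be replayed as a condensed trace. The little state table below only tracks the lock mode per instance and whether the instance has a copy of the block; it is an illustration of the walkthrough, not of real GCS messaging, past-image bookkeeping or SCN handling.

```python
# Condensed trace of the four transfer cases above. The state table keeps, per
# instance, [lock mode, has a copy of the block]; everything else (GCS messages,
# past images, redo flushes, SCNs) is deliberately left out of this illustration.

master = "D"                                    # D masters this block
state = {n: ["NULL", False] for n in "ABCD"}

def show(step: str) -> None:
    print(step, {n: m for n, (m, has) in state.items() if has or m != "NULL"})

# 1. Read with no transfer: C reads the block from disk in shared mode.
state["C"] = ["S", True];                          show("1 read, no transfer:")

# 2. Read to write transfer: B wants to modify; C ships the block and drops to NULL.
state["B"] = ["X", True]; state["C"][0] = "NULL";  show("2 read -> write    :")

# 3. Write to write transfer: A wants to modify; B flushes its redo, ships the
#    current image, keeps only a stale copy, and drops to NULL.
state["A"] = ["X", True]; state["B"][0] = "NULL";  show("3 write -> write   :")

# 4. Write to read transfer: C wants to read; A ships the block and downgrades
#    its own lock from exclusive to shared.
state["C"] = ["S", True]; state["A"][0] = "S";     show("4 write -> read    :")
```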

By setting the parameter gc_files_to_locks, Cache Fusion can be turned off.
When the cached resources mastered on one node no longer need to stay there, dynamic remastering can move the mastership to a different node.


Questions:
1. When a block has not yet been read by any instance and the first instance reads it, how is it locked, and with what kind of lock? If another instance wants to read the same block at almost the same moment, how does Oracle arbitrate so that one instance reads it from disk and the other then gets it through the cache?
2. If a block has already been read in by another instance, how does this instance find that out?
3. If an instance changes a data block, is the change shipped to the other instances, or do the other instances merely learn about it and refresh their own state?
4. If an instance wants to age a block out of its cache while other instances also have the block cached, some copies changed and some not, how are the changes on this instance and on the other instances handled? When a table is truncated or dropped, what is different from the single-instance case?


5. How should the application be designed so that RAC really pays off, instead of introducing contention that weakens the system?
6. How does RAC implement locks?

Locks are resources kept in the SGA of each instance and are typically used to control access to database blocks. Each instance usually holds or controls a certain number of locks associated with a range of blocks.

When an instance requests a block, a lock must be obtained for that block, and the lock has to come from the instance that currently controls it; in other words, the locks are distributed across the instances.

So a particular lock may have to be obtained from a different instance.

These locks are not fixed to one instance, however; based on how frequently they are requested, they are moved to the instances that use them most, which improves efficiency.




Answers:
1. When instance A needs to read a block, it sends a request to the GCS. The block's master, instance B, grants the resource in Shared mode through the GCS, records this state on master node B, and then notifies the requesting node A, which converts its lock from Null to Shared and starts the I/O.
At this point the requesting node A holds a Shared lock on the block. If another instance C then wants to read the same block, the GCS on master node B is informed and asks A to ship the block to C.
2. When an instance requests a block, it must contact the block's master node; through the GCS, the master node tracks which instance holds the block, which version of the block it holds, and in what mode, and it keeps a record of all of this.
3. If an instance changes a data block, the disk heartbeat mechanism of the LMON process in GES comes into play: every 3 seconds each node's CKPT process updates a data block of the control file, and because the control file is shared, the nodes can check whether the others are updating it in time.


4. The current state of the block is looked up on the master node. If the block has been changed, the redo is written and the log is flushed before the block is shipped, and the block held by the current node is downgraded from an exclusive lock to a shared lock.
5. This is achieved through GCS and GES.


References:
http://www.cnblogs.com/sopost/archive/2013/03/14/2960490.html
http://blog.csdn.net/tianlesoftware/article/details/5353087
"Oracle RAC"
