A summary of Oracle DRM (Dynamic Resource Mastering)
1. What is DRM
DRM (Dynamic Resource Mastering) is a feature introduced in Oracle 10g. In an Oracle RAC environment, Oracle uses the GRD (Global Resource Directory) to record resource information for the individual nodes; the GRD is maintained jointly by the GCS (Global Cache Service) and the GES (Global Enqueue Service). Because each RAC node has its own SGA and buffer cache, GCS and GES designate one instance in the cluster to manage each cache resource, in order to keep the caches of all nodes consistent while preserving performance; that instance is the resource master. Without DRM, remastering (changing the master node) happens only during reconfiguration, which is triggered automatically by normal operations such as an instance starting or shutting down, or by an abnormal node being evicted from the cluster. So if node A is the resource master, the resource stays mastered on node A until a reconfiguration occurs.
In theory, with DRM, a non-master node that frequently accesses a resource can be promoted to become that resource's master, reducing the number of subsequent cross-node access requests. For example, if a cached resource mastered on node A is frequently accessed by node B, the resource can be remastered from node A to node B.
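To observe which instance currently masters an object's resources, and how often DRM has moved it, one can query the `V$GCSPFMASTER_INFO` view. This is a sketch; the view and its columns exist in 10g+ RAC, but the object id used here is a hypothetical value you would look up in `DBA_OBJECTS` first:

```sql
-- Which instance currently masters the resources of a given object,
-- and how many times DRM has remastered it.
-- 12345 is a hypothetical OBJECT_ID taken from DBA_OBJECTS.
SELECT object_id,
       current_master,    -- instance currently mastering the object
       previous_master,   -- instance that mastered it before the last remaster
       remaster_cnt       -- number of times the master has changed (DRM activity)
FROM   v$gcspfmaster_info
WHERE  object_id = 12345;
```

A steadily climbing `remaster_cnt` for a hot object is one sign that DRM is active on it.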
However, in a well-designed RAC application, access to the same resource from multiple nodes should be avoided in the first place; if a resource is only ever accessed from one node, DRM has nothing to offer. Moreover, the DRM process itself consumes resources.
/* Below is an example from the laoxiong.net blog: http://www.laoxiong.net/problem-caused-by-drm.html */
A RAC system showed intermittent performance problems that returned to normal on their own after a period of time.
The Top 5 Timed Events section of the AWR report showed:
<span style="font-size:12px;">
Top 5 Timed Events                                            Avg %Total
~~~~~~~~~~~~~~~~~~                                           wait   call
Event                                Waits  Time (s)  (ms)   time  Wait Class
------------------------------ ----------- --------- ----- ------ -----------
latch: cache buffers lru chain     774,812   140,185   181   29.7  Other
gc buffer busy                   1,356,786    61,708    45   13.1  Cluster
latch: object queue header ope     903,456    55,089    61   11.7  Other
latch: cache buffers chains        360,522    49,016   136   10.4  Concurrency
gc current grant busy              112,970    19,893   176    4.2  Cluster
-----------------------------------------------------------------------------
</span>
Three of the Top 5 are latch-related waits, while the other two are RAC-related waits.
Looking at the more detailed wait data reveals other problems:
<span style="font-size:12px;">
                                                             Avg
                                        %Time   Total Wait  wait  Waits
Event                            Waits   -outs    Time (s)  (ms)   /txn
------------------------------ --------- ------ ---------- ----- ------
latch: cache buffers lru cha     774,812    N/A    140,185   181    1.9
gc buffer busy                 1,356,786      6     61,708    45    3.3
latch: object queue header o     903,456    N/A     55,089    61    2.2
latch: cache buffers chains      360,522    N/A     49,016   136    0.9
gc current grant busy            112,970    N/A     19,893   176    0.3
gcs drm freeze in enter serv      38,442    N/A     18,537   482    0.1
gc cr block 2-way              1,626,280      0     15,742    10    3.9
gc remaster                        6,741    N/A     12,397 1,839    0.0
row cache lock                    52,143      6      9,834   189    0.1
</span>
The data above also shows, besides the Top 5 waits, two relatively rare wait events: "gcs drm freeze in enter server mode" and "gc remaster". From their names, they are obviously related to DRM. So what is the correlation between these two wait events and the Top 5 events? The MOS document "Bug 6960699 - 'latch: cache buffers chains' contention/ORA-481/kjfcdrmrfg: SYNC TIMEOUT/OERI[kjbldrmrpst:!master]" (Doc ID 6960699.8) mentions that DRM may indeed cause heavy "latch: cache buffers chains" and "latch: object queue header operation" waits; although the document does not mention it, it cannot be ruled out that DRM also causes "latch: cache buffers lru chain" waits.
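A quick way to see whether sessions are piling up on these waits at any given moment is to poll the standard `V$SESSION` view; this is a sketch, and the event names returned must be matched against your version's exact spelling:

```sql
-- Count non-idle sessions per wait event; run repeatedly while the
-- problem is occurring to watch DRM-related waits spike.
SELECT event, COUNT(*) AS sessions
FROM   v$session
WHERE  wait_class <> 'Idle'
GROUP  BY event
ORDER  BY sessions DESC;
```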
To further verify that the performance problem was related to DRM, the tail -f command was used to monitor the trace file of the LMD background process. Whenever the trace file showed DRM starting, querying the V$SESSION view revealed large numbers of "latch: cache buffers chains" and "latch: object queue header operation" wait events, together with "gcs drm freeze in enter server mode" and "gc remaster" waits; at the same time the system load rose and end users reported performance degradation. After the DRM completed, these waits disappeared and system performance returned to normal.
It seems the problem can be avoided simply by turning off DRM. So how is DRM disabled? Many MOS documents mention a method that sets two hidden parameters:
<span style="font-size:12px;">
_gc_affinity_time=0
_gc_undo_affinity=false
</span>
Unfortunately, these two parameters are static, which means the instances must be restarted for them to take effect.
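With this static approach, the parameters have to be written to the spfile and only apply after a restart. A sketch of how they might be set (underscore/hidden parameters should normally only be changed on Oracle Support's advice):

```sql
-- Static hidden parameters: written to the spfile on all instances,
-- effective only after the instances are restarted.
ALTER SYSTEM SET "_gc_affinity_time" = 0     SCOPE = SPFILE SID = '*';
ALTER SYSTEM SET "_gc_undo_affinity" = FALSE SCOPE = SPFILE SID = '*';
```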
In fact, you can set two other hidden parameters, which are dynamic, to achieve the same goal. Setting them to the values below does not completely disable DRM, but it does turn DRM off "de facto".
<span style="font-size:12px;">
_gc_affinity_limit=250
_gc_affinity_minimum=10485760
</span>
You can even set these two parameters to larger values. They take effect immediately; after they were set on all nodes, DRM no longer occurred, and after a long period of observation the performance problem described in this article never reappeared.
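Since these two parameters are dynamic, they can be applied without a restart. A sketch (again, hidden parameters are best changed only with Oracle Support's blessing):

```sql
-- Dynamic hidden parameters: take effect immediately on the running
-- instances, and SCOPE=BOTH also persists them in the spfile.
-- SID='*' applies the change to all instances sharing the spfile.
ALTER SYSTEM SET "_gc_affinity_limit"   = 250      SCOPE = BOTH SID = '*';
ALTER SYSTEM SET "_gc_affinity_minimum" = 10485760 SCOPE = BOTH SID = '*';
```

The effect of these values is to raise DRM's affinity thresholds so high that remastering is effectively never triggered, rather than to switch the feature off outright.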
The following is the wait event data after DRM was turned off:
<span style="font-size:12px;">
Top 5 Timed Events                                            Avg %Total
~~~~~~~~~~~~~~~~~~                                           wait   call
Event                                Waits  Time (s)  (ms)   time  Wait Class
------------------------------ ----------- --------- ----- ------ -----------
CPU time                                      15,684         67.5
db file sequential read          1,138,905     5,212     5   22.4  User I/O
gc cr block 2-way                  780,224       285     0    1.2  Cluster
log file sync                      246,580       246     1    1.1  Commit
sql*net more data from client      296,657       236     1    1.0  Network
-----------------------------------------------------------------------------

                                                             Avg
                                        %Time   Total Wait  wait  Waits
Event                            Waits   -outs    Time (s)  (ms)   /txn
------------------------------ --------- ------ ---------- ----- ------
db file sequential read        1,138,905    N/A      5,212     5    3.8
gc cr block 2-way                780,224    N/A        285     0    2.6
log file sync                    246,580      0        246     1    0.8
sql*net more data from clien     296,657    N/A        236     1    1.0
sql*net message from dblink       98,833    N/A        218     2    0.3
gc current block 2-way           593,133    N/A        218     0    2.0
gc cr grant 2-way                530,507    N/A        154     0    1.8
db file scattered read            54,446    N/A        151     3    0.2
kst: async disk IO                 6,502    N/A        107    16    0.0
gc cr multi block request        601,927    N/A                0    2.0
sql*net more data to client    1,336,225    N/A                0    4.5
log file parallel write          306,331    N/A                0    1.0
gc current block busy              6,298    N/A                     0.0
backup: sbtwrite2                  4,076    N/A                     0.0
gc buffer busy                    17,677      1          3     0    0.1
gc current grant busy             75,075    N/A          1     0    0.3
direct path read                  49,246    N/A          1     0    0.2
</span>
My understanding: DRM (Dynamic Resource Mastering) in theory promotes a non-master node to master and can reduce cross-node resource access, but it can bring bigger problems. Suppose a RAC cluster has two nodes, and node 2 caches a very large table during an idle period. When the business gets busy and node 1 needs to access that table: without DRM, node 1 simply reads the data from storage; with DRM, the cached resources are found on node 2, so they are remastered and transferred from node 2's cache to node 1, which consumes a great deal of interconnect bandwidth and other resources.