Oracle RAC HM (Hang Manager)

Source: Internet
Author: User
Tags time interval

In the Oracle database, suspend (hang) refers to the waiting state entered by a process due to the inability to obtain the requested resources, which can be lifted only after the requested resources have been obtained, and the HM implements the management of hangs, including the monitoring, analysis, recording and resolution of hang.


The wait chain is made up of blocking processes and waiting processes, while one or more root blocking processes exist in the blocking process, which blocks all other processes, and if the root blocking process is busy with some operations, then perhaps the presence of such a wait chain is normal, if the blocking process is idle, Then perhaps the emergence of this wait chain is not normal, and the way to break the wait chain is to terminate the root blocking process. HM can proactively discover the existence of the waiting chain in the database, and from the perspective of the analysis of them, if found to really affect the performance of the data block hang, depending on the specific circumstances to determine whether to solve the problem, and even if not directly resolved, the corresponding diagnostic information will be recorded and continuous monitoring.


The work of HM is composed of seven stages

Phase 1 (Collection Phase): At this stage, the DIA0 process for each instance collects hang analyze information on a regular basis.

Phase 2 (Discovery phase): At this stage, the DIA0 process for each instance analyzes the collected hang Alalyze information, locates the session where hang is present, and sends the DIA0 process to the master node.

Phase 3 (Drawing phase): At this stage, the dia0 process of the master node draws the message from each instance of the DIA0 process, drawing the wait chain.

Phase 4 (Analysis Phase): At this stage, the master node dia0 the process according to the drawn wait chain and analyzes whether hang is indeed present.

Phase 5 (Validation phase): At this stage, the master node dia0 process executes phase 1-4 again, then compares the analysis results of phase 4 with this one, and verifies that hang is really happening.

Phase 6 (Positioning phase): At this stage, the results of the master node dia0 process More validation phase are positioned to the root blocking process of the wait chain.

Phase 7 (resolution Phase): At this stage, the master node dia0 process determines whether hang can be resolved based on the value of the parameter _hang_resoluton_scope.


Parameters of HM

_hang_detection_enabled: This parameter determines whether the HM attribute is enabled in the database, and the default value is true.

_hang_detection_interval: This parameter specifies the time interval for which HM collects hang analyze information, and the default value is 32s.

_hang_verification_interval: This parameter specifies the time interval for the HM Validation hang, and the default value is 46s.

_hang_resolution_scope: This parameter specifies the range that HM can operate when the hang is resolved, the default value is process, and the allowable values are as follows:

OFF: The HM will only continue to monitor hang, and will not do anything to fix hang.

Process: Indicates that HM can resolve hang by terminating the root blocking process, but the root blocking process here cannot be an important background process for the database because it causes the instance to crash.

Instance: Indicates that HM can resolve the hang by terminating the instance.


Trace log files for the DIA0 process

Main trace file (<SID>_DIA0_<PID>.TRC): This log file records the details of the DIA0 process, including the process of discovering, analyzing, and handling the hang.

History Tracker File (<sid>_dia0_<pid>_ N.TRC): Because the trace log file of the DIA0 process constantly generates information as the database runs, it can make the log file very large, and the DIA0 process periodically writes log information to its history log file, where n is a positive integer and increases over time.

Incident Log file: If HM resolves the hang by terminating the process, the ORA-32701 error is first recorded in the Alert.log, and because of the existence of the ADR, the DIA0 process also produces a incident log file that records the details of the problem.


Hang-related views

V$hang_info: This view contains details of the hang that was found by HM.

V$hang_session_info: This view contains the session information related to hang.

V$hang_statistics: This view contains statistics related to hang.


Oracle RAC HM (Hang Manager)

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.