1. Symptom
The database instance cannot respond to client-initiated requests.
2. Types
-Oracle processes are waiting for a resource or event
-Oracle process spins: a spin means the Oracle process keeps executing the same code in a loop without making progress. In the V$SESSION view, such hung sessions can often be seen to stay in the "ACTIVE" status.
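The two types can often be told apart by sampling V$SESSION twice for the same session and comparing the results. The sketch below is only an illustration: `SessionSample` and `classify_hang` are hypothetical names, the fields mirror the V$SESSION columns STATUS, STATE, EVENT, and SEQ#, and the rule used (an unchanged SEQ# means no new wait has started) is a rough heuristic, not an official definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionSample:
    """One snapshot of a session; fields mirror V$SESSION columns."""
    status: str  # STATUS, e.g. "ACTIVE" or "INACTIVE"
    state: str   # STATE, "WAITING" or one of the "WAITED ..." values
    event: str   # EVENT, name of the current or most recent wait event
    seq: int     # SEQ#, incremented each time a new wait begins


def classify_hang(first: SessionSample, second: SessionSample) -> str:
    """Classify a session from two samples taken some seconds apart."""
    if second.status != "ACTIVE":
        return "idle"
    if first.seq != second.seq:
        # New waits started between the samples, so work is moving forward.
        return "progressing"
    if second.state == "WAITING":
        # Stuck in the same wait: blocked on a resource or event.
        return "waiting"
    # ACTIVE, not waiting, and no new wait recorded: likely looping on CPU.
    return "spinning"
```

A session that comes back as "spinning" should normally also show up as a CPU consumer at the operating-system level, so it is worth confirming with the host tools before drawing conclusions.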
3. By fault scope, unresponsive faults fall into the following cases:
-Hang of a single session or a subset of sessions
-Hang of a single database instance
-Hang of multiple or all instances in OPS or RAC
4. Cause analysis of unresponsive faults
-The database host load is too high, far exceeding the host's capacity
-- Poor application design leads to low database performance and a sharp increase in the number of active sessions
-- Host memory is seriously insufficient, causing heavy paging
-Improper routine maintenance
-- The storage space for archived logs is full
-- Running MOVE on, or adding foreign key constraints to, large tables with heavy DML activity
-- Incorrect resource plan configuration
-Oracle database bugs
-Other reasons
-- In a RAC database, when a node leaves or joins the cluster, the database freezes for a period of time while resource reconfiguration is performed.
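Some of the causes above can be checked mechanically. As one example, the sketch below tests whether an archived-log destination is close to full. It assumes the destination is an ordinary file system directory (it does not cover ASM or a recovery-area size quota); `used_fraction`, `archive_dest_nearly_full`, the 90% threshold, and the path in the usage note are all hypothetical choices made for illustration.

```python
import shutil


def used_fraction(total: int, used: int) -> float:
    """Return the used share of a file system as a value between 0 and 1."""
    if total <= 0:
        raise ValueError("total size must be positive")
    return used / total


def archive_dest_nearly_full(path, threshold=0.90, usage_fn=shutil.disk_usage):
    """Return True when the file system holding `path` is at or above `threshold`.

    `usage_fn` defaults to shutil.disk_usage and can be swapped out in tests.
    """
    usage = usage_fn(path)
    return used_fraction(usage.total, usage.used) >= threshold
```

For instance, `archive_dest_nearly_full("/u01/arch")` (a made-up path) would be called on the database host before and during an incident to rule this cause in or out.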
5. Troubleshooting process
-Confirm the scope of the impact on the system
-At the same time, ask the system maintainers and developers whether anything changed in the affected system before the fault occurred, including hosts, hardware, operating systems, networks, databases, and applications
-Log on to the database host directly, so that network, database listener, or client factors do not distort the analysis
-If you cannot log on to the host, try shutting down the business system, restart the host if necessary, and then monitor host resources.
-After logging on to the host, run commands such as top or topas to check CPU usage, physical memory usage, virtual memory usage, and I/O usage.
-Try to connect to the database with SQL*Plus; when a normal connection is not possible, debugger tools such as gdb or dbx can be used to take a system state dump of the database. Use tools such as strace and truss to check the system calls of abnormal processes, and tools such as pstack and procstack to view their call stacks.
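The host-side tools named above differ by platform. The following sketch maps a platform to the matching commands for one suspect process; `diagnostic_commands` and the `_TOOLS` table are hypothetical helpers, and only Linux and AIX are covered here, as an assumption, since those are the platforms where the tool names above most commonly apply. The function only builds argument lists; it does not run anything.

```python
import platform
from typing import Dict, List, Optional

# Tool names come from the steps above; platforms not listed are left out
# rather than guessed.
_TOOLS: Dict[str, Dict[str, List[str]]] = {
    "Linux": {"monitor": ["top"], "syscalls": ["strace", "-p"], "stack": ["pstack"]},
    "AIX": {"monitor": ["topas"], "syscalls": ["truss", "-p"], "stack": ["procstack"]},
}


def diagnostic_commands(pid: int, system: Optional[str] = None) -> Dict[str, List[str]]:
    """Return argument lists for monitoring the host and inspecting one process."""
    system = system or platform.system()
    if system not in _TOOLS:
        raise ValueError(f"no tool mapping for platform {system!r}")
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    tools = _TOOLS[system]
    return {
        "monitor": list(tools["monitor"]),               # host-wide resource view
        "syscalls": tools["syscalls"] + [str(pid)],      # system calls of the process
        "stack": tools["stack"] + [str(pid)],            # call stack of the process
    }
```

Each list can be passed to `subprocess.run` on the database host. Attaching to an Oracle process usually requires the oracle software owner or root account, and attaching a tracer to a background process carries some risk, so it is best done deliberately and briefly.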
6. After connecting to the database with SQL*Plus, run diagnostics such as hanganalyze and a system state dump; check wait events, abnormal sessions, the SQL statements being executed, and so on.
7. Find the cause of the fault, collecting as much diagnostic data as possible
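As a rough illustration of this step, the sketch below generates the oradebug commands typically typed into a SYSDBA SQL*Plus session. `build_oradebug_script` is a hypothetical helper, and the levels used (3 for hanganalyze, 266 for the system state dump) are commonly seen values chosen here as assumptions; the right levels depend on the Oracle version and on how much overhead the system can tolerate.

```python
def build_oradebug_script(rac: bool = False,
                          hang_level: int = 3,
                          systemstate_level: int = 266) -> str:
    """Return oradebug commands for a hang analysis and a system state dump."""
    # "-g all" asks oradebug to run the command on every RAC instance.
    scope = "-g all " if rac else ""
    lines = [
        "oradebug setmypid",                                   # attach to this session's process
        "oradebug unlimit",                                    # lift the trace file size limit
        f"oradebug {scope}hanganalyze {hang_level}",
        f"oradebug {scope}dump systemstate {systemstate_level}",
        "oradebug tracefile_name",                             # print this session's trace file
    ]
    return "\n".join(lines) + "\n"
```

Running the hanganalyze and dump commands a second time a minute or so later is a common practice, because comparing the two outputs shows whether the wait chains are truly static.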
8. When service must be restored urgently, you can recover the application by killing the offending sessions or restarting the database instance.
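For the session-killing option, the statement is ALTER SYSTEM KILL SESSION with the SID and SERIAL# of the target session taken from V$SESSION. The helper below, `kill_session_sql`, is a hypothetical sketch that only builds the statement text; the optional instance number applies to RAC, and the values in the usage note are made up.

```python
from typing import Optional


def kill_session_sql(sid: int, serial: int, inst_id: Optional[int] = None) -> str:
    """Build an ALTER SYSTEM KILL SESSION statement for one session."""
    for name, value in (("sid", sid), ("serial", serial)):
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer")
    target = f"{sid},{serial}"
    if inst_id is not None:
        if not isinstance(inst_id, int) or inst_id <= 0:
            raise ValueError("inst_id must be a positive integer")
        # In RAC, "@inst_id" addresses a session on another instance.
        target += f",@{inst_id}"
    return f"ALTER SYSTEM KILL SESSION '{target}' IMMEDIATE;"
```

For example, `kill_session_sql(123, 4567)` yields `ALTER SYSTEM KILL SESSION '123,4567' IMMEDIATE;`. It is worth capturing the diagnostic data from the earlier steps first, since killing sessions or restarting the instance destroys the evidence needed to find the root cause.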
9. Based on the final diagnosis result, patch the database or modify the application to fundamentally solve the problem.