Hanganalyze3 analyzes and summarizes session-level: ALTERSESSIONSETEVENTSimmediatetracenameHANGANALYZElevellevel; instance-level: ORADEBUGhanganalyzelevel; and cluster-wide: Executor-gdefhanganaly
Hanganalyze 3 analyzes and summarizes the following methods: alter session set events immediate trace name HANGANALYZE level; instance level: ORADEBUG hanganalyze level: ORADEBUG setmypid ORADEBUG setinst all ORADEBUG-g def hanganaly
Hanganalyze 3 analysis summary
One is session-level:
Alter session set events 'immediate trace name HANGANALYZE level ';
One is the instance level:
ORADEBUG hanganalyze
One is the cluster scope:
ORADEBUG setmypid
ORADEBUG setinst all
ORADEBUG-g def hanganalyze
First, explain the meaning of each level:
1-2: only the hanganalyze output, do not dump any process
3: Level2 + Dump processes in the IN_HANG State [simplest and easiest to judge]
4: Level3 + Dump blockers in the wait chain (Status: LEAF/LEAF_NW/IGN_DMP)
5: Level4 + Dump all processes in the waiting chain (Status: NLEAF)
The hanganalyze report is divided into multiple parts. The session segment information is always described in detail by a header. The Oracle8i and 9i information are slightly different:
Sid is the Session ID.
Sess_srno is serial #
Proc_ptr is Process Pointer
The ospid is the OS Process ID.
Cnode is the Node Id, which is used only by Oracle9i.
Nodenum is hanganalyze
Numbers customized to record these sessions start from 0.
State is the node State.
Adjlist is a nearby node (usually a blocker node)
Predecessor is a Predecessor node, which usually represents a waiter node.
Next I will explain some important node states:
IN_HANG: This indicates that the node is in a deadlock state, and other nodes (blocker) are also in this state.
LEAF/LEAF_NW: the node is usually blocker. The "predecessor" column of the entry determines whether the node is blocker.
LEAF indicates that the NODE is not waiting for other resources
LEAF_NW may be not waiting for other resources or using CPU.
The following example shows that node 16 blocks Node 19 Resources:
Nodenum]/cnode/sid/sess_srno/session/
Ospid/state/start/finish/[adjlist]/predecessor
[16]/0/17/154/0 x24617be0/26800/LEAF/29/30/19
[19]/0/20/13/0x24619830/26791/NLEAF/33/34/[16]/186
NLEAF: Generally, it can be seen that these sessions are blocked resources. This usually indicates that the database has performance problems rather than the database hang.
IGN/IGN_DMP: This type of session is generally considered as an idle session, unless the node exists in its adjlist column.
If the session is not idle, the node in its adjlist is waiting for other nodes to release resources.
SINGLE_NODE/SINGLE_NODE_NW: similar to idle sessions
Practice hanganalyze:
Test Case
C: \ Users \ Administrator> SQLPLUS/AS SYSDBA
SQL * Plus: Release 10.2.0.4.0-Production on Sat May 28 15:30:03 2011
Copyright (c) 1982,200 7, Oracle. All Rights Reserved.
If the database has hang and logon fails, run systemdump to check the cause.
C: \ Users \ Administrator> sqlplus-prelim '/as sysdba'
SQL * Plus: Release 10.2.0.4.0-Production on Sat May 28 15:31:13 2011
Copyright (c) 1982,200 7, Oracle. All Rights Reserved.
Enter password:
SQL> oradebug setmypid
Statement processed.
SQL> oradebug unlimit;
Statement processed.
SQL> oradebug dump systemstate 10
Statement processed.
SQL> oradebug tracefile_name
C: \ oracle \ product \ 10.2.0 \ admin \ systex \ udump \ export ex_ora_4636.trc
SQL> oradebug hanganalyze 3
Hang Analysis in c: \ oracle \ product \ 10.2.0 \ admin \ systex \ udump \ export ex_ora_6824.trc
Let's take a look at the trace of hanganalyze.
Chain 1: :
<0/139/105/0x18263170/2932/latch: shared pool> /// it seems that 139 blocks the session.
-- <0/164/1/0x1825f030/4592/latch: library cache>
Other chains found:
Chain 2: :
<0/135/1/0x18269350/3956/Streams AQ: waiting for time man>
Chain 3: :
<0/142/2/0x18266260/3816/Streams AQ: qmn slave idle wait>
Chain 4: :
<0/146/90/0 x182641c0/5944/latch: shared pool>
Chain 5: :
<0/151/9/0x18265a38/6060/Streams AQ: qmn coordinator idle>
Chain 6: :
<0/159/1138/0x18262948/6100/latch: shared pool>
Chain 7: :
<0/161/1/0x182608a8/4856/latch: shared pool>
Chain 8: :
<0/162/1/0x18260080/5904/latch: shared pool>
OK. Search for 2932 in systemdump.
O/S info: user: SYSTEM, term: WILLSON-LI, ospid: 2932
OSD pid info: Windows threadid: 2932, image: ORACLE. EXE (SHAD)
Dump of memory from 0x000007FF18249EC0 to 0x000007FF1824A0C8
7FF18249EC0 00000006 00000000 17615728 000007FF [...... (Wa ......]
7FF18249ED0 00000010 000313A7 183ADF60 000007FF [......]
7FF18249EE0 00000003 000313A7 1798B0E8 000007FF [......]
7FF18249EF0 0000000B 000313A7 1836A1C0 000007FF [......]
7FF18249F00 00000004 0003129B 16835088 000007FF [...]
7FF18249F10 00000007 000313A7 16836020 000007FF [......]
7FF18249F20 00000007 000313A7 00000000 00000000 [......]
7FF18249F30 00000000 00000000 00000000 00000000 [......]
Repeat 24 times
7FF1824A0C0 00000000 00000000 [......]
(FOB) flags = 2 fib = 000007FF169BB290 incno = 0 pending I/o cnt = 0
Fname = C: \ ORACLE \ PRODUCT \ 10.2.0 \ ORADATA \ SYSTEX \ CONTROL04.CTL
Fno = 0 lblksz = 16384 fsiz = 430
(FOB) flags = 2 fib = 000007FF169BB630 incno = 0 pending I/o cnt = 0
Fname = C: \ ORACLE \ PRODUCT \ 10.2.0 \ ORADATA \ SYSTEX \ SYSTEM01.DBF
Fno = 1 lblksz= 8192 fsiz = 64000
----------------------------------------
SO: 000007FF1836A1C0, type: 4, owner: 000007FF18263170, flag: INIT/-/0x00
(Session) sid: 139 trans: 000007FF16ADE418, creator: 000007FF18263170, flag: (100041) USR/-BSY /-/-/-/-/-
DID: 0001-0010-00000029, short-term DID: 0000-0000-00000000
Txn branch: 0000000000000000
Oct: 0, prv: 0, SQL: 0000000000000000, psql: 000007FF17EABB40, user: 0/SYS
Service name: SYS $ USERS
O/S info: user: WILLSON-LI \ Administrator, term: WILLSON-LI, ospid: 5588: 5372, machine: WORKGROUP \ WILLSON-LI
Program: sqlplus.exe
Application name: sqlplus.exe, hash value = 254292535
Waiting for 'latch: shared pool 'blocking sess = 0x0000000000000000 seq = 273 wait_time = 0 seconds since wait started = 129
Address = c972f38, number = d6, tries = 0 // --- waiting for latch c972f38
Dumping Session Wait History
For 'SQL * Net message from client' count = 1 wait_time = 61829702
Driver id = 42455100, # bytes = 1, = 0
For 'SQL * Net message to client' count = 1 wait_time = 3
Driver id = 42455100, # bytes = 1, = 0
For 'ksdxexeotherwait' count = 1 wait_time = 2
= 0, = 0, = 0
For 'ksdxexeotherwait' count = 1 wait_time = 2
Omitted... Always waiting for ksdxexeotherwait
--- Search for c972f38 again
SO: 000007FF18263170, type: 2, owner: 0000000000000000, flag: INIT/-/0x00
(Process) Oracle pid = 16, callcur/top: 000007FF183ADF60/000007FF183ADF60, flag: (0 )-
Int error: 0, call error: 0, sess error: 0, txn error 0
(Post info) last post received ed: 0 0 0
Last post received-location: No post
Last process to post me: none
Last post sent: 0 0 0
Last post sent-location: No post
Last process posted by me: none
(Latch info) wait_event = 0 bits = 20
Location from where call was made: kghalo:
Waiting for c972f38 Child shared pool level = 7 child # = 1
Location from where latch is held: kghupr1: Chunk Header
Context saved from call: 343209960
State = busy, wlstate = free
Waiters [orapid (seconds since: put on list, posted, alive check)]:
11 (283,130 6567910, 283)
15 (268,130 6567910, 268)
10 (268,130 6567910, 268)
16 (129,130 6567910, 129)
23 (120,130 6567910, 120)
24 (108,130 6567910, 108)
Waiter count = 6
Gotten 586960 times wait, failed first 238 sleeps 8
Gotten 0 times nowait, failed: 0
On wait list for c972f38 ----------------------------------------------
Holding (efd = 7) 15e479a0 Child library cache level = 5 child # = 2
Location from where latch is held: kglobpn: child: latch
Context saved from call: 1
State = busy, wlstate = free
We can see that SO: 000007FF18263170 holding the Child library cache is waiting for the Child shared pool.
SQL> oradebug poke 0xc972f38 4 0
BEFORE: [00C972F38, 00C972F3C) = 00000001
AFTER: [00C972F38, 00C972F3C) = 00000000
Others:
1. Before 10 Gb, use tools such as DBX and GDB:
Find the PID (ps-ef | grep LOCAL) of a front-end process, and use gdb $ ORACLE_HOME/bin/oracle for tracking.
Oracle @ gurufl02: ~> Ps-ef | grep LOCAL
Oracle 26358 26357 0? 00:00:00 oracleora9 (DESCRIPTION = (LOCAL = YES) (ADDRESS = (PROTOCOL = beq )))
Oracle 27848 27810 0 00:00:00 pts/3 grep LOCAL
Oracle @ gurufl02: ~> Gdb $ ORACLE_HOME/bin/oracle 26358
Run print ksudss (10)
(Gdb) print ksudss (10)
[Switching to Thread 4154533552 (LWP 26358)]
Warning: Unable to restore previusly selected frame.
$1 = 181773020
Current language: auto; currently asm
In this case, a TRACE file is generated under UDUMP or BDUMP. It contains system state dump.
In addition to using print ksudss (10) for systemstate dump, you can also perform the following dump
Print ksdhng (3,1, 0) is equivalent to oradebug hanganalyze 3
Print ksudps (10) is equivalent to oradebug dump processstate 10
Print curdmp () is equivalent to oradebug call curdmp (that is, oradebug dump cursordump)
Print ksdtrc (4) is equivalent to oradebug dump events 4 (here the parameter indicates level, 1-session, 2-process, 4-system)
Print ksdsel (10046, 12)-equivalent to setting 10046 event level 12 for attach Processes
Print skdxipc ()-equivalent to oradebug ipc
Print skdxprst ()-equivalent to oradebug procstat
Of course, if oradebug can be used, oradebug should be used. After all, it is much more convenient and safer.
2. 10 Gb sqlplus-prelim
If 10 Gb, you can use the sqlplus-prelim option to force logon.
Export ORACLE_SID = ora9
Sqlplus-prelim '/as sysdba'
Oradebug setmypid
Oradebug unlimit;
Oradebug dump systemstate 10
3. If none of the above conditions are met, you can use pstack to obtain the stack information. Alternatively, you can use Truss, tusc, straceAnd other tools to see what they are waiting.
Database HANG occupation is a headache. How to find the cause of HANG occupation is a challenge that DBAs must face