oracle systemstate dump介紹

來源:互聯網
上載者:User

標籤:

    當資料庫出現嚴重的效能問題或者hang了的時候,伺服器端sqlplus也無法串連時,此時如果想擷取資料庫當前的狀態資訊,以便事後診斷,那麼我們非常需要通過systemstate dump來知道進程在做什麼,在等待什麼,誰是資源的持有人,誰阻塞了別人。在出現上述問題時,及時收集systemstate dump非常有助於問題原因的分析。ORACLE 10g 開始,sqlplus提供了這麼一個功能參數-prelim,在sqlplus無法串連的情況下,串連登入到資料庫。下面關於這些知識點的一個總結

 

There are two ways to connect to sqlplus using a preliminary connection.

sqlplus -prelim / as sysdba
 
sqlplus /nolog
set _prelim on
connect / as sysdba

 

用sysdba登入到資料庫上:

$sqlplus / as sysdba

或者

$sqlplus -prelim / as sysdba <==當資料庫已經很慢或者hang到無法串連

 

Collection commands for Hanganalyze and Systemstate: Non-RAC:
Sometimes, database may actually just be very slow and not actually hanging. It is therefore recommended,  where possible to get 2 hanganalyze and 2 systemstate dumps in order to determine whether processes are moving at all or whether they are "frozen".

Hanganalyze
sqlplus ‘/ as sysdba‘
oradebug setmypid
oradebug unlimit
oradebug hanganalyze 3
-- Wait one minute before getting the second hanganalyze
oradebug hanganalyze 3
oradebug tracefile_name
exit

Systemstate
sqlplus ‘/ as sysdba‘
oradebug setmypid
oradebug unlimit
oradebug dump systemstate 266
oradebug dump systemstate 266
oradebug tracefile_name
exit

 

Collection commands for Hanganalyze and Systemstate: RAC
There are 2 bugs affecting RAC that without the relevant patches being applied on your system, make using level 266 or 267 very costly. Therefore without these fixes in place it highly unadvisable to use these level

For information on these patches see:
Document 11800959.8 Bug 11800959 - A SYSTEMSTATE dump with level >= 10 in RAC dumps huge BUSY GLOBAL CACHE ELEMENTS - can hang/crash instances
Document 11827088.8 Bug 11827088 - Latch ‘gc element‘ contention, LMHB terminates the instance
 
Note:  both bugs are fixed in 11.2.0.3.
 
Collection commands for Hanganalyze and Systemstate: RAC with fixes for bug 11800959 and bug 11827088
For 11g:
sqlplus ‘/ as sysdba‘
oradebug setorapname reco
oradebug  unlimit
oradebug -g all hanganalyze 3
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate 266
oradebug -g all dump systemstate 266
exit
Collection commands for Hanganalyze and Systemstate: RAC without fixes for Bug 11800959 and Bug 11827088
sqlplus ‘/ as sysdba‘
oradebug setorapname reco
oradebug unlimit
oradebug -g all hanganalyze 3
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate 258
oradebug -g all dump systemstate 258
exit
For 10g, run oradebug setmypid instead of oradebug setorapname reco:
sqlplus ‘/ as sysdba‘
oradebug setmypid
oradebug unlimit
oradebug -g all hanganalyze 3
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate 258
oradebug -g all dump systemstate 258
exit
In RAC environment, a dump will be created for all RAC instances in the DIAG trace file for each instance.

 

那麼我們現在來看一個例子吧:

[[email protected] ~]$ sqlplus -prelim / as sysdba
 
SQL*Plus: Release 10.2.0.5.0 - Production on Wed Mar 2 16:31:03 2016
 
Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.
 
SQL> oradebug setmypid
Statement processed.
SQL> oradebug unlimit
Statement processed.
SQL> oradebug dump systemstate 266
Statement processed.
SQL> oradebug dump systemstate 266
Statement processed.
SQL> oradebug tracefile_name
/u01/app/oracle/admin/SCM2/udump/scm2_ora_13598.trc
SQL> exit
Disconnected from ORACLE

警示日誌裡面會看到類似這樣的資訊:

Wed Mar 02 16:32:08 CST 2016

System State dumped to trace file

Wed Mar 02 16:32:48 CST 2016

System State dumped to trace file /u01/app/oracle/admin/xxx/udump/scm2_ora_13598.trc

$ORACLE_BASE/admin/ORACLE_SID/udump/ 下找到對應的trc檔案,如下所示,你會看到類似下面的一些資訊.

 

systemstate dump有多個層級:

2: dump (不包括lock element)

10: dump

11: dump + global cache of RAC

256: short stack (函數堆棧)

258: 256+2 -->short stack +dump(不包括lock element)

266: 256+10 -->short stack+ dump

267: 256+11 -->short stack+ dump + global cache of RAC

level 11和 267會 dump global cache, 會產生較大的trace 檔案,一般情況下不推薦。一般情況下,如果進程不是太多,推薦用266,因為這樣可以dump出來進程的函數堆棧,可以用來分析進程在執行什麼操作。但是產生short stack比較耗時,如果進程非常多,比如2000個進程,那麼可能耗時30分鐘以上。這種情況下,可以產生level 10 或者 level 258, level 258 比 level 10會多收集short short stack, 但比level 10少收集一些lock element data.

 

雖然通過system state dump收集了進程的相關,但是如何有效解讀相關資訊,並診斷分析問題是一個不小的難題和挑戰!

 

參考資料:

https://blogs.oracle.com/Database4CN/entry/systemstate_dump_%E4%BB%8B%E7%BB%8D

http://tech.e2sn.com/oracle/troubleshooting/hang/how-to-log-on-even-when-sysdba-can-t-do-so

https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=352993211736965&parent=DOCUMENT&sourceId=68738.1&id=452358.1&_afrWindowMode=0&_adf.ctrl-state=z7hwh19s9_319

oracle systemstate dump介紹

聯繫我們

該頁面正文內容均來源於網絡整理,並不代表阿里雲官方的觀點,該頁面所提到的產品和服務也與阿里云無關,如果該頁面內容對您造成了困擾,歡迎寫郵件給我們,收到郵件我們將在5個工作日內處理。

如果您發現本社區中有涉嫌抄襲的內容,歡迎發送郵件至: info-contact@alibabacloud.com 進行舉報並提供相關證據,工作人員會在 5 個工作天內聯絡您,一經查實,本站將立刻刪除涉嫌侵權內容。

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.