資料庫死結嚴重引發中介軟體串連池滿故障診斷,中介軟體池滿
1、故障現象
前台系統應用無法登陸,weblogic伺服器應用程式的運行狀態顯示為overload,線程串連池滿。
2、故障原因分析
根據上述故障現象,分析基礎可以確定為是Weblogic有過多的串連連到資料庫,因為會話一直保持未釋放,將串連池佔滿後,導致新的串連無法請求到串連池。在此關鍵是分析為什麼會有大量的會話佔滿串連池而不釋放。
3、問題分析過程3.1 session數超過1000
|
Snap Id |
Snap Time |
Sessions |
Cursors/Session |
Begin Snap: |
38194 |
2014/12/10 8:00 |
1010 |
13.8 |
End Snap: |
38196 |
2014/12/10 10:00 |
1004 |
14.4 |
Elapsed: |
|
119.65 (mins) |
|
|
DB Time: |
|
20,686.23 (mins) |
|
|
3.2 每秒執行次數超過15000多次
|
Per Second |
Per Transaction |
Redo size: |
18,144,642.95 |
459,254.26 |
Logical reads: |
199,155.23 |
5,040.77 |
Block changes: |
100,209.94 |
2,536.39 |
Physical reads: |
5,241.51 |
132.67 |
Physical writes: |
2,182.48 |
55.24 |
User calls: |
16,134.45 |
408.37 |
Parses: |
177.18 |
4.48 |
Hard parses: |
69.02 |
1.75 |
Sorts: |
106.97 |
2.71 |
Logons: |
0.58 |
0.01 |
Executes: |
15,955.55 |
403.85 |
Transactions: |
39.51 |
|
3.3 94%的等待時間都是花在“enq: TX -row lock contention”
Event |
Waits |
Time(s) |
Avg Wait(ms) |
% Total Call Time |
Wait Class |
enq: TX - row lock contention |
2,419,702 |
1,174,344 |
485 |
94.6 |
Application |
CPU time |
|
28,795 |
|
2.3 |
|
db file sequential read |
2,507,054 |
13,011 |
5 |
1 |
User I/O |
db file scattered read |
2,358,449 |
11,234 |
5 |
0.9 |
User I/O |
db file parallel write |
475,780 |
7,760 |
16 |
0.6 |
System I/O |
3.4 消耗時間最長的SQL語句
Elapsed Time (s) |
CPU Time (s) |
Executions |
Elap per Exec (s) |
% Total DB Time |
SQL Id |
SQL Text |
744,943 |
51 |
226 |
3296.21 |
60.02 |
2d0msz9tv9z5c |
DELETE FROM PUB_XXLINE WHERE U... |
429,987 |
29 |
483 |
890.24 |
34.64 |
1yucnh7p2mjdk |
DELETE FROM PUB_XXLINE WHERE S... |
2d0msz9tv9z5c文本如下:
DELETE FROM PUB_XXLINE WHERE USER_ID=:1
1yucnh7p2mjdk文本如下:
DELETE FROM PUB_XXLINE WHERE SESSION_ID=:1
以上兩條SQL語句產生嚴重行爭用,並且在會話級產生死結
3.5 資料庫內驗證會話死結
(1)查看被阻塞的sid:800會話是被哪個會話所阻塞
select sid,serial#,sql_id,status,blocking_session from v$sessionwhere sid=800;
sid |
serial# |
sql_id |
status |
blocking_session |
800 |
28777 |
2d0msz9tv9z5c |
ACTIVE |
347 |
從上面看來,sid:800執行的SQL語句的ID是2d0msz9tv9z5c,它被session id 347所阻塞
(2)查看session id 347是否被阻塞
select sid,serial#,sql_id,status,blocking_session from v$sessionwhere sid=347;
sid |
serial# |
sql_id |
status |
blocking_session |
347 |
15367 |
2d0msz9tv9z5c |
ACTIVE |
800 |
Session 347執行的語句的ID也是2d0msz9tv9z5c,它被sid 800所阻塞,在這裡已經形成死結。
3.6 資料庫日誌中發現大量死結資訊
Wed Dec 10 09:55:05 2014 Global Enqueue Services Deadlock detected. More info in file /oracle/admin/XXTH/bdump/XXth1_lmd0_1541770.trc. Wed Dec 10 09:55:36 2014 Thread 1 advanced to log sequence 493665 (LGWR switch) Current log# 16 seq# 493665 mem# 0: +REDOGROUP/XXth/onlinelog/group_16.263.837169415 Wed Dec 10 09:56:04 2014 Thread 1 advanced to log sequence 493666 (LGWR switch) Current log# 14 seq# 493666 mem# 0: +REDOGROUP/XXth/onlinelog/group_14.261.837169357 Wed Dec 10 09:56:05 2014 Global Enqueue Services Deadlock detected. More info in file /oracle/admin/XXTH/bdump/XXth1_lmd0_1541770.trc. |
4、問題原因診斷總結
語句DELETEFROM PUB_XXLINE WHERE USER_ID=:1每次操作的資料行有多行,多個會話同時執行該語句,在資料庫內形成死結,並且玥塞單次刪除一行資料的語句DELETEFROM PUB_XXLINE WHERE SESSION_ID=:1從而使Weblogic(中介軟體)的串連池消耗殆盡。無法新增串連。
本文作者:黎俊傑(網名:踩點),從事”系統架構、作業系統、存放裝置、資料庫、中介軟體、應用程式“六個層面系統性的效能最佳化工作
歡迎加入 系統效能最佳化專業群,共同探討效能最佳化技術。群號:258187244