資料庫效能最佳化分析案例---解決SQL語句過度消耗CPU問題

來源:互聯網
上載者:User
解決|資料|資料庫|問題|效能|最佳化|語句
問題描述:
10月25日上午濱州網通的工程師報告OSS應用系統運行緩慢,具體操作是通過OSS系統查詢話單,很長時間才能返回結果,嚴重影響了客戶的正常使用。

問題處理:
1.登陸資料庫主機,用sar命令看到idle的值持續為0,CPU的資源已經耗盡:

bz_db1# sar 2 4

SunOS kest 5.8 Generic_108528-19 sun4u    10/26/04

10:56:46    %usr    %sys    %wio   %idle
10:56:48       1       4      95       0
10:56:50       1       5      94       0
10:56:52       0       6      93       0
10:56:54       1       6      93       0

Average        1       5      94      0


2.使用TOP命令看到有兩個明顯佔用CPU利用率過高的進程,以下是top命令的結果:

bz_db1# top

last pid:  1664;load averages:  3.26,  3.24,  3.69                       
159 processes: 152 sleeping, 2 running, 2 zombie, 1 stopped, 2 on cpu
CPU states:  1.5% idle, 72.5% user, 17.9% kernel,  8.0% iowait,  0.0% swap
Memory: 2.0G real, 233M free, 2.0G swap in use, 3.4G swap free

   PID USERNAME THR PR NCE  SIZE   RES STATE   TIME FLTS    CPU COMMAND
 27420 oracle     1 10   0  1.3G  1.2G cpu01  22.9H    2 31.94% oracle
 27418 oracle     1 10   0  1.3G  1.2G run    23.0H    6 26.86% oracle
  5943 oracle     1 59   0  1.3G  1.2G sleep  25:26   37  4.92% oracle
  6295 oracle     1 55   0  1.3G  1.2G run    25:14   74  4.90% oracle
  7778 oracle     1 43   0  1.3G  1.2G sleep  11:43  110  4.86% oracle
 13270 oracle     1 59   0  1.3G  1.2G sleep 210.6H    0  0.96% oracle
 13056 oracle     1 48   0  1.3G  1.2G sleep 303:30    0  0.37% oracle
 10653 root       1 58   0 2560K 1624K cpu00   0:00    0  0.32% top
 18827 oracle     1 58   0  1.3G  1.2G sleep  18.4H    0  0.31% oracle
 12748 oracle   258 58   0  1.3G  1.2G sleep 555:14    0  0.21% oracle
 10634 oracle     1 59   0  1.3G  1.2G sleep   0:01    0  0.21% oracle
 28458 oracle     1 58   0  1.3G  1.2G sleep 535:02    0  0.18% oracle
 13075 oracle     1 59   0  1.3G  1.2G sleep 326:33    0  0.15% oracle
 13173 oracle     1 58   0  1.3G  1.2G sleep 593:07    0  0.13% oracle
  4927 oracle     1 59   0  1.3G  1.2G sleep  33.4H    0  0.11% oracle

可以看到這兩個進程號分別是27420和27418.

3.捕獲佔用CPU利用率過高的SQL語句:

以下用到了我總結的SQL語句:

SQL>set line 240
SQL>set verify off
SQL>column sid format 999
SQL>column pid format 999
SQL>column S_# format 999
SQL>column username format A9 heading "ORA User"
SQL>column program  format a29
SQL>column SQL format a60
SQL>COLUMN OSname format a9 Heading "OS User"
SQL>SELECT P.pid pid,S.sid sid,P.spid spid,S.username username,
S.osuser osname,P.serial# S_#,P.terminal,P.program  program,
P.background,S.status,RTRIM(SUBSTR(a.sql_text, 1, 80))  SQL
FROM v$process P, v$session S,v$sqlarea A WHERE P.addr = s.paddr
AND S.sql_address = a.address (+)  AND P.spid LIKE '%&1%';

Enter value for 1: 27420(注意這裡應輸入佔用CPU最高的進程對應的PID)

得到以下SQL語句:

Select nvl(sum(LOCALCHARGE),0),nvl(sum(usage),0) from LOCALUSAGE where to_char(ENDTIME,'YYYYMMDD')=20041016
and LOCALCHARGE>0 and caller like '0543886%';

27418進程對應的SQL語句如下:
select nvl(sum(LOCALCHARGE),0) from LOCALUSAGE where to_char(ENDTIME,'YYYYMMDD')=20041016 and caller like '0543888%';


4.使用相關使用者串連到資料庫,檢查其執行計畫:
SQL>connect wacos/oss
Connected.

SQL>@?/rdbms/admin/utlxplan.sql
Table created.

SQL>set autotrace on

SQL>set timing on

SQL>Select nvl(sum(LOCALCHARGE),0),nvl(sum(usage),0) from LOCALUSAGE where to_char(ENDTIME,'YYYYMMDD')=20041016
 and LOCALCHARGE>0 and caller like '0543886%';

NVL(SUM(LOCALCHARGE),0) NVL(SUM(USAGE),0)
----------------------- -----------------
                      0                 0

Elapsed: 00:02:56.37

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=13435 Card=1 Bytes=5
          3)

   1    0   SORT (AGGREGATE)
   2    1     PARTITION RANGE (ALL)
   3    2       TABLE ACCESS (FULL) OF 'LOCALUSAGE' (Cost=13435 Card=1
          81 Bytes=9593)

Statistics
----------------------------------------------------------
        258  recursive calls
          0  db block gets
      88739  consistent gets
      15705  physical reads
          0  redo size
        580  bytes sent via SQL*Net to client
        651  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          8  sorts (memory)
          0  sorts (disk)
          1  rows processed

發現對localusage表做了全表掃描,什麼記錄也沒有返回居然用了2分多鐘。

SQL> select nvl(sum(LOCALCHARGE),0) from LOCALUSAGE where to_char(ENDTIME,'YYYYMMDD')=20040816 and caller like '0543888%';

NVL(SUM(LOCALCHARGE),0)
-----------------------
                   27.6

Elapsed: 00:03:56.46

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=13435 Card=1 Bytes=4
          0)

   1    0   SORT (AGGREGATE)
   2    1     PARTITION RANGE (ALL)
   3    2       TABLE ACCESS (FULL) OF 'LOCALUSAGE' (Cost=13435 Card=3
          615 Bytes=144600)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      88588  consistent gets
      15615  physical reads
          0  redo size
        507  bytes sent via SQL*Net to client
        651  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

這個SQL語句有結果返回,發現也是對localusage表做了全表掃描,但速度也很慢,用了3分多鐘。

SQL> select count(*) from localusage;

  COUNT(*)
----------
   5793776

該表有579萬多條記錄,資料量很大,全表掃描已經不再適合。

5.檢查該表的類型:

SQL> SELECT INDEX_NAME, TABLE_NAME, STATUS, PARTITIONED FROM USER_INDEXES WHERE TABLE_NAME='LOCALUSAGE';

INDEX_NAME                     TABLE_NAME                     STATUS   PAR
------------------------------ ------------------------------ -------- ---
I_LOCALUSAGE_SID               LOCALUSAGE                     N/A      YES
UI_LOCALUSAGE_ST_SEQ           LOCALUSAGE                     N/A      YES

SQL> SELECT index_name,table_name,locality FROM user_part_indexes where table_name='LOCALUSAGE';

INDEX_NAME                     TABLE_NAME                     LOCALI
------------------------------ ------------------------------ ------
I_LOCALUSAGE_SID               LOCALUSAGE                     LOCAL
UI_LOCALUSAGE_ST_SEQ           LOCALUSAGE                     LOCAL

發現該表是分區表,並在SERVICEID,STARTIME和CDRSEQUENCE列上建立了分區索引,索引類型是local索引。

6.查看分區索引的索引索引值:

SQL> select INDEX_NAME,COLUMN_NAME,INDEX_OWNER from dba_ind_columns where TABLE_NAME='LOCALUSAGE';

INDEX_NAME           COLUMN_NAME          INDEX_OWNER
-------------------- -------------------- ------------------------------
I_LOCALUSAGE_SID     SERVICEID            WACOS
UI_LOCALUSAGE_ST_SEQ STARTTIME            WACOS
UI_LOCALUSAGE_ST_SEQ CDRSEQUENCE          WACOS

發現在endtime和caller列上都沒有建立索引,這也是導致SQL語句做全表掃描的最終原因。

7.決定建立新的分區索引以消除全表掃描:

(1).首先查看localusage表分區情況:

SQL> select PARTITION_NAME,tablespace_name from user_tab_partitions where table_name='LOCALUSAGE';

PARTITION_NAME                 TABLESPACE_NAME
------------------------------ ------------------------------
LOCALUSAGE_200312              WACOS
LOCALUSAGE_200401              WACOS
LOCALUSAGE_200402              WACOS
LOCALUSAGE_200404              WACOS
LOCALUSAGE_200405              WACOS
LOCALUSAGE_200406              WACOS
LOCALUSAGE_200407              WACOS
LOCALUSAGE_200409              WACOS
LOCALUSAGE_200410              WACOS
LOCALUSAGE_200411              WACOS
LOCALUSAGE_200403              WACOS
LOCALUSAGE_200408              WACOS
LOCALUSAGE_200412              WACOS

13 rows selected.

(2).在caller列上建立local分區索引:
SQL>set timing on
SQL>create index I_LOCALUSAGE_CALLER on localusage(caller)
LOCAL
(      
        PARTITION LOCALUSAGE_200312,
 PARTITION LOCALUSAGE_200401,
 PARTITION LOCALUSAGE_200402,
 PARTITION LOCALUSAGE_200404,
 PARTITION LOCALUSAGE_200405,
 PARTITION LOCALUSAGE_200406,
 PARTITION LOCALUSAGE_200407,
 PARTITION LOCALUSAGE_200409,
 PARTITION LOCALUSAGE_200410,
 PARTITION LOCALUSAGE_200411,
 PARTITION LOCALUSAGE_200403,
 PARTITION LOCALUSAGE_200408,
 PARTITION LOCALUSAGE_200412
)
TABLESPACE wacos
STORAGE(
 INITIAL 6553600
 NEXT 6553600
 MAXEXTENTS unlimited
 PCTINCREASE 0)
 PCTFREE 5
 NOLOGGING;

Index created.

Elapsed: 00:06:27.90  (由於資料量比較大,耗時6分鐘)

8.再次查看執行計畫:
SQL>Select nvl(sum(LOCALCHARGE),0),nvl(sum(usage),0) from LOCALUSAGE where to_char(ENDTIME,'YYYYMMDD')=20041016
and LOCALCHARGE>0  and caller like '0543886%';

NVL(SUM(LOCALCHARGE),0) NVL(SUM(USAGE),0)
----------------------- -----------------
                      0                 0

Elapsed: 00:00:03.00

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=22 Card=1 Bytes=53)
   1    0   SORT (AGGREGATE)
   2    1     PARTITION RANGE (ALL)
   3    2       TABLE ACCESS (BY LOCAL INDEX ROWID) OF 'LOCALUSAGE' (Cost=22 Card=181 Bytes=9593)
   4    3         INDEX (RANGE SCAN) OF 'I_LOCALUSAGE_CALLER' (NON-UNIQUE) (Cost=14 Card=65063)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      16813  consistent gets
        569  physical reads
          0  redo size
        580  bytes sent via SQL*Net to client
        651  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

這次走了索引後速度明顯快多了,用了3秒鐘就返回了結果。

SQL>select nvl(sum(LOCALCHARGE),0) from LOCALUSAGE where to_char(ENDTIME,'YYYYMMDD')=20040816 and caller like '0543888%';

NVL(SUM(LOCALCHARGE),0)
-----------------------
                   27.6

Elapsed: 00:00:24.73

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=22 Card=1 Bytes=40)
   1    0   SORT (AGGREGATE)
   2    1     PARTITION RANGE (ALL)
   3    2       TABLE ACCESS (BY LOCAL INDEX ROWID) OF 'LOCALUSAGE' (Cost=22 Card=3615 Bytes=144600)
   4    3         INDEX (RANGE SCAN) OF 'I_LOCALUSAGE_CALLER' (NON-UNIQUE) (Cost=14 Card=65063)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     129336  consistent gets
       7241  physical reads
          0  redo size
        507  bytes sent via SQL*Net to client
        651  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

這個SQL語句走了索引,用了24秒鐘返回結果,效能明顯提高了很多。



相關文章

Beyond APAC's No.1 Cloud

19.6% IaaS Market Share in Asia Pacific - Gartner IT Service report, 2018

Learn more >

Apsara Conference 2019

The Rise of Data Intelligence, September 25th - 27th, Hangzhou, China

Learn more >

Alibaba Cloud Free Trial

Learn and experience the power of Alibaba Cloud with a free trial worth $300-1200 USD

Learn more >

聯繫我們

該頁面正文內容均來源於網絡整理,並不代表阿里雲官方的觀點,該頁面所提到的產品和服務也與阿里云無關,如果該頁面內容對您造成了困擾,歡迎寫郵件給我們,收到郵件我們將在5個工作日內處理。

如果您發現本社區中有涉嫌抄襲的內容,歡迎發送郵件至: info-contact@alibabacloud.com 進行舉報並提供相關證據,工作人員會在 5 個工作天內聯絡您,一經查實,本站將立刻刪除涉嫌侵權內容。