oracle 表串連,oracle串連
一. sort merge joins串連(排序合并串連) 原理
指的是兩個表串連時, 通過串連列先分別排序後, 再通過合併作業來得到最後返回的結果集的方法.
假如表 T1 和 T2 的串連方式是排序合并串連, oracle 執行步驟如下:
(1) 根據 sql 語句中的謂詞條件(如果有) 訪問 T1 表, 得到一個過濾的結果集, 然後按照 T1 中的串連列對結果集進行排序
(2) 根據 sql 語句中的謂詞條件(如果有) 訪問 T2 表, 得到一個過濾的結果集, 然後按照 T2 中的串連列對結果集進行排序
(3) 將 1 和 2 的結果集合并起來, 對記錄進行匹配得到最後的結果集.
通常來說, sort merge joins串連(排序合并串連) 使用並不廣泛, 因為在大部分情況下使用 nested loops 或者 hash joins 都能獲得比它更好的執行效率, 但是由於 hash joins 只能用於等值串連條件, 所以在非等值條件串連以及非 like 非 "<>" 情況下, 如果串連列上已經有排序, 使用 sort merge joins串連方式能獲得比較好的執行效率
二. sort merge joins串連(排序合并串連) 特性
(1) 驅動表最多訪問一次, 如果獨立的謂詞條件(不涉及驅動表欄位的函數或者運算式等)不成立, 則不用再去訪問驅動表
(2) 被驅動表最多訪問一次. 如果驅動表沒有記錄, 被驅動表不用訪問
(3) 驅動表的選擇對於執行成本以及效能沒有太大的影響
(4) 支援大部分的串連條件, 比如 ">" "<" ">=" "<=", 不支援 like, "<>"
構造實驗資料
SQL> CREATE TABLE t1 ( 2 id NUMBER NOT NULL, 3 n NUMBER, 4 pad VARCHAR2(4000), 5 CONSTRAINT t1_pk PRIMARY KEY(id) 6 );Table created.SQL> CREATE TABLE t2 ( 2 id NUMBER NOT NULL, 3 t1_id NUMBER NOT NULL, 4 n NUMBER, 5 pad VARCHAR2(4000), 6 CONSTRAINT t2_pk PRIMARY KEY(id), 7 CONSTRAINT t2_t1_fk FOREIGN KEY (t1_id) REFERENCES t1 8 );Table created.SQL> CREATE TABLE t3 ( 2 id NUMBER NOT NULL, 3 t2_id NUMBER NOT NULL, 4 n NUMBER, 5 pad VARCHAR2(4000), 6 CONSTRAINT t3_pk PRIMARY KEY(id), 7 CONSTRAINT t3_t2_fk FOREIGN KEY (t2_id) REFERENCES t2 8 );Table created.SQL> CREATE TABLE t4 ( 2 id NUMBER NOT NULL, 3 t3_id NUMBER NOT NULL, 4 n NUMBER, 5 pad VARCHAR2(4000), 6 CONSTRAINT t4_pk PRIMARY KEY(id), 7 CONSTRAINT t4_t3_fk FOREIGN KEY (t3_id) REFERENCES t3 8 );Table created.SQL> execute dbms_random.seed(0)PL/SQL procedure successfully completed.SQL> INSERT INTO t1 SELECT rownum, rownum, dbms_random.string('a',50) FROM dual CONNECT BY level <= 10 ORDER BY dbms_random.random;10 rows created.SQL> INSERT INTO t2 SELECT 100+rownum, t1.id, 100+rownum, t1.pad FROM t1, t1 dummy ORDER BY dbms_random.random;100 rows created.SQL> INSERT INTO t3 SELECT 1000+rownum, t2.id, 1000+rownum, t2.pad FROM t2, t1 dummy ORDER BY dbms_random.random;1000 rows created.SQL> INSERT INTO t4 SELECT 10000+rownum, t3.id, 10000+rownum, t3.pad FROM t3, t1 dummy ORDER BY dbms_random.random;10000 rows created.SQL> COMMIT;Commit complete.
使用 hint 讓執行計畫以 T3 作為驅動表
SQL> select /*+ leading(t3) use_merge(t4) */ * 2 from t3, t4 3 where t3.id = t4.t3_id and t3.n = 1100;10 rows selected.SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));PLAN_TABLE_OUTPUT--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------SQL_ID g0rdyg9hdh9m0, child number 0-------------------------------------select /*+ leading(t3) use_merge(t4) */ * from t3, t4 where t3.id =t4.t3_id and t3.n = 1100Plan hash value: 3831111046-----------------------------------------------------------------------------------------------------------------| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |-----------------------------------------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.02 | 119 | | | || 1 | MERGE JOIN | | 1 | 10 | 10 |00:00:00.02 | 119 | | | || 2 | SORT JOIN | | 1 | 1 | 1 |00:00:00.01 | 15 | 2048 | 2048 | 2048 (0)||* 3 | TABLE ACCESS FULL| T3 | 1 | 1 | 1 |00:00:00.01 | 15 | | | ||* 4 | SORT JOIN | | 1 | 10000 | 10 |00:00:00.02 | 104 | 974K| 535K| 865K (0)|| 5 | TABLE ACCESS FULL| T4 | 1 | 10000 | 10000 |00:00:00.01 | 104 | | | |-----------------------------------------------------------------------------------------------------------------Predicate Information (identified by operation id):--------------------------------------------------- 3 - filter("T3"."N"=1100) 4 - access("T3"."ID"="T4"."T3_ID") filter("T3"."ID"="T4"."T3_ID")
使用 hint 讓執行計畫以 T4 作為驅動表
SQL> select /*+ leading(t4) use_merge(t3) */ * 2 from t3, t4 3 where t3.id = t4.t3_id and t3.n = 1100;10 rows selected.SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));PLAN_TABLE_OUTPUT--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------SQL_ID gxuwn06y1c1az, child number 0-------------------------------------select /*+ leading(t4) use_merge(t3) */ * from t3, t4 where t3.id =t4.t3_id and t3.n = 1100Plan hash value: 875334572-----------------------------------------------------------------------------------------------------------------| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |-----------------------------------------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.04 | 119 | | | || 1 | MERGE JOIN | | 1 | 10 | 10 |00:00:00.04 | 119 | | | || 2 | SORT JOIN | | 1 | 10000 | 1001 |00:00:00.04 | 104 | 974K| 535K| 865K (0)|| 3 | TABLE ACCESS FULL| T4 | 1 | 10000 | 10000 |00:00:00.01 | 104 | | | ||* 4 | SORT JOIN | | 1001 | 1 | 10 |00:00:00.01 | 15 | 2048 | 2048 | 2048 (0)||* 5 | TABLE ACCESS FULL| T3 | 1 | 1 | 1 |00:00:00.01 | 15 | | | |-----------------------------------------------------------------------------------------------------------------Predicate Information (identified by operation id):--------------------------------------------------- 4 - access("T3"."ID"="T4"."T3_ID") filter("T3"."ID"="T4"."T3_ID") 5 - filter("T3"."N"=1100)
從返回的執行計畫結果中我們可以看到:
1. 以 T3 為驅動表和以 T4 為驅動表, 兩者的 cost (A-Time) 和 buffers 都差不多
2. 以 T3 為驅動表時, T3 訪問一次, T4 也是訪問一次; 以 T4 為驅動表時, T4 訪問一次, T3 也是訪問一次
3. 需要排序, 如果 PGA 空間重足時在 PGA 中排序, 不如果不足則交換到磁碟上排序
另外, 在執行計畫中有幾個統計資訊列 0Mem, 1Mem, Use_Mem 需要介紹一下
- 0Mem 指的是預計在 PGA 中排序需要的記憶體大小
- 1Mem 指的是當記憶體大小(PGA)不足以進行排序, 預計將資料一次交換到磁碟空間的記憶體大小
- Used-Mem 指的是執行時實際使用的記憶體大小, 其中括弧中的數字代表進行磁碟交換的次數, 0 代表沒有進行磁碟交換
三. sort merge joins串連(排序合并串連) 最佳化
SQL> select /*+ leading(t3) use_merge(t4) */ * 2 from t3, t4 3 where t3.id = t4.t3_id and t3.n = 1100 and t4.n = 10034;SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));PLAN_TABLE_OUTPUT----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------SQL_ID bg9h60c7ak3ud, child number 0-------------------------------------select /*+ leading(t3) use_merge(t4) */ * from t3, t4 where t3.id =t4.t3_id and t3.n = 1100 and t4.n = 10034Plan hash value: 3831111046-----------------------------------------------------------------------------------------------------------------| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |-----------------------------------------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 119 | | | || 1 | MERGE JOIN | | 1 | 1 | 1 |00:00:00.01 | 119 | | | || 2 | SORT JOIN | | 1 | 1 | 1 |00:00:00.01 | 15 | 2048 | 2048 | 2048 (0)||* 3 | TABLE ACCESS FULL| T3 | 1 | 1 | 1 |00:00:00.01 | 15 | | | ||* 4 | SORT JOIN | | 1 | 1 | 1 |00:00:00.01 | 104 | 2048 | 2048 | 2048 (0)||* 5 | TABLE ACCESS FULL| T4 | 1 | 1 | 1 |00:00:00.01 | 104 | | | |-----------------------------------------------------------------------------------------------------------------Predicate Information (identified by operation id):--------------------------------------------------- 3 - filter("T3"."N"=1100) 4 - access("T3"."ID"="T4"."T3_ID") filter("T3"."ID"="T4"."T3_ID") 5 - filter("T4"."N"=10034)SQL> create index t4_n on t4(n);Index created.SQL> select /*+ leading(t3) use_merge(t4) */ * 2 from t3, t4 3 where t3.id = t4.t3_id and t3.n = 1100 and t4.n = 10034;SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));PLAN_TABLE_OUTPUT----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------SQL_ID bg9h60c7ak3ud, child number 0-------------------------------------select /*+ leading(t3) use_merge(t4) */ * from t3, t4 where t3.id =t4.t3_id and t3.n = 1100 and t4.n = 10034Plan hash value: 1501658231------------------------------------------------------------------------------------------------------------------------------------| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |------------------------------------------------------------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 18 | 1 | | | || 1 | MERGE JOIN | | 1 | 1 | 1 |00:00:00.01 | 18 | 1 | | | || 2 | SORT JOIN | | 1 | 1 | 1 |00:00:00.01 | 15 | 0 | 2048 | 2048 | 2048 (0)||* 3 | TABLE ACCESS FULL | T3 | 1 | 1 | 1 |00:00:00.01 | 15 | 0 | | | ||* 4 | SORT JOIN | | 1 | 1 | 1 |00:00:00.01 | 3 | 1 | 2048 | 2048 | 2048 (0)|| 5 | TABLE ACCESS BY INDEX ROWID| T4 | 1 | 1 | 1 |00:00:00.01 | 3 | 1 | | | ||* 6 | INDEX RANGE SCAN | T4_N | 1 | 1 | 1 |00:00:00.01 | 2 | 1 | | | |------------------------------------------------------------------------------------------------------------------------------------Predicate Information (identified by operation id):--------------------------------------------------- 3 - filter("T3"."N"=1100) 4 - access("T3"."ID"="T4"."T3_ID") filter("T3"."ID"="T4"."T3_ID") 6 - access("T4"."N"=10034)SQL> create index t3_n on t3(n);Index created.SQL> select /*+ leading(t3) use_merge(t4) */ * 2 from t3, t4 3 where t3.id = t4.t3_id and t3.n = 1100 and t4.n = 10034;SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));PLAN_TABLE_OUTPUT------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------SQL_ID bg9h60c7ak3ud, child number 0-------------------------------------select /*+ leading(t3) use_merge(t4) */ * from t3, t4 where t3.id =t4.t3_id and t3.n = 1100 and t4.n = 10034Plan hash value: 1827980052------------------------------------------------------------------------------------------------------------------------------------| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |------------------------------------------------------------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 6 | 1 | | | || 1 | MERGE JOIN | | 1 | 1 | 1 |00:00:00.01 | 6 | 1 | | | || 2 | SORT JOIN | | 1 | 1 | 1 |00:00:00.01 | 3 | 1 | 2048 | 2048 | 2048 (0)|| 3 | TABLE ACCESS BY INDEX ROWID| T3 | 1 | 1 | 1 |00:00:00.01 | 3 | 1 | | | ||* 4 | INDEX RANGE SCAN | T3_N | 1 | 1 | 1 |00:00:00.01 | 2 | 1 | | | ||* 5 | SORT JOIN | | 1 | 1 | 1 |00:00:00.01 | 3 | 0 | 2048 | 2048 | 2048 (0)|| 6 | TABLE ACCESS BY INDEX ROWID| T4 | 1 | 1 | 1 |00:00:00.01 | 3 | 0 | | | ||* 7 | INDEX RANGE SCAN | T4_N | 1 | 1 | 1 |00:00:00.01 | 2 | 0 | | | |------------------------------------------------------------------------------------------------------------------------------------Predicate Information (identified by operation id):--------------------------------------------------- 4 - access("T3"."N"=1100) 5 - access("T3"."ID"="T4"."T3_ID") filter("T3"."ID"="T4"."T3_ID") 7 - access("T4"."N"=10034)
從上面的執行計畫中可以看出, 全表掃描後最後使用的 buffer 為 119, 在一個表上建立索引使用索引範圍掃描後 buffer 為 18, 在兩個表上建立的索引使用索引範圍掃描後 buffer 為 6. 由此可以見, 在表的謂詞條件上如果有索引的話, 將會提高執行效率.
此外, 由於 sort merge joins 需要先在 PGA 中進行排序,, 如果 PGA 空間不足, 就會將資料交換到磁碟上進行排序。由於, 磁碟相對於記憶體來說是慢速裝置,因此在磁碟上排序會比在記憶體上排序慢, 另外排序排序消耗的時間還需要加上資料在記憶體和磁碟上傳輸的時間,因此儘可能減少磁碟排序的次數也就會提高執行效率, 有兩種方法會減少磁碟排序:
1. 增大 PGA 的大小, 如果是 oracle 10g,需要增加參數 pga_aggregate_target 的大小,如果是 oracle 11g,則增加 memory_target 的大小
2. 減少排序的資料量, 一些不需要的欄位就不要寫在 select 後面
四. 小結
遇到 sql 調優時,如果執行計畫顯示表的串連方式是 sort merge join:
首先,看看 sql 語句是不是表的串連方式有沒有可能轉換為 hash join(等值串連條件)
其次,只能使用 sort merge join 時看看錶的謂詞條件上是不是有索引
最後,看看執行計畫排序佔用的記憶體大小是不是在磁碟上有排序, 是不是能夠避免在磁碟上排序
參考: <<基於 oracle 的 sql 最佳化>>
<<收穫, 不止 oracle>>
<<Troubleshooting oracle performance>>