Oracle table join-nestedloop nested loop join

Source: Internet
Author: User

Oracle table join-nestedloop nested loop join
I. nested loop Principle

A nested loop connection is a two-layer nested loop that matches two tables in sequence and returns the table join method of the returned result set.

In the following SQL statement, tables T1 and T2 are connected through nested loops, and T1 is the driver table.

select *from T1, T2where T1.id = T2.id and T1.name = 'David';
Translate the preceding SQL statement into a pseudo code as follows :.
for each row in (select * from T1 where name = 'David') loopfor (select * from T2 where T2.id = outer.id) loopIf match then pass the row on to the next stepIf no match then discard the rowend loopend loop

Specifically, if the preceding SQL statement executes nested loops, the actual execution process should be as follows:
(1) first, oracle will determine which table is the driving table based on certain rules (calculated based on the cost of statistics or mandatory hint, which table is the driving table (assuming T1 is the driving table)
(2) query the driver table "select * from T1 where name = 'David'" and obtain the driver result set Q1.
(3) traverse the drive result set Q1 and the driven table T2, and retrieve a record from the drive result set Q1, next, traverse T2 and judge whether there is a matching record in T2 according to the connection condition T2.id = T1.id. If yes, the record is retained. If no matching exists, this row is ignored, then retrieve the next record from Q1 and traverse T2 for matching until all records in Q1 are obtained.
Ii. nested loop features

Nested loop connections have the following features:

(1) generally, the driver table is accessed only once in an SQL statement and is accessed multiple times by the driver table.
(2) You do not have to wait for the processing to complete all rows before returning some processed data.
(3) Creating indexes on restricted conditions and connection condition columns can improve execution efficiency.
(4) supports all types of connections (equijoin, non-equijoin, like, etc)

Construct Test Data
SQL> CREATE TABLE t1 (  2    id NUMBER NOT NULL,  3    n NUMBER,  4    pad VARCHAR2(4000),  5    CONSTRAINT t1_pk PRIMARY KEY(id)  6  );Table created.SQL> CREATE TABLE t2 (  2    id NUMBER NOT NULL,  3    t1_id NUMBER NOT NULL,  4    n NUMBER,  5    pad VARCHAR2(4000),  6    CONSTRAINT t2_pk PRIMARY KEY(id),  7    CONSTRAINT t2_t1_fk FOREIGN KEY (t1_id) REFERENCES t1  8  );Table created.SQL> CREATE TABLE t3 (  2    id NUMBER NOT NULL,  3    t2_id NUMBER NOT NULL,  4    n NUMBER,  5    pad VARCHAR2(4000),  6    CONSTRAINT t3_pk PRIMARY KEY(id),  7    CONSTRAINT t3_t2_fk FOREIGN KEY (t2_id) REFERENCES t2  8  );Table created.SQL> CREATE TABLE t4 (  2    id NUMBER NOT NULL,  3    t3_id NUMBER NOT NULL,  4    n NUMBER,  5    pad VARCHAR2(4000),  6    CONSTRAINT t4_pk PRIMARY KEY(id),  7    CONSTRAINT t4_t3_fk FOREIGN KEY (t3_id) REFERENCES t3  8  );Table created.SQL> execute dbms_random.seed(0)PL/SQL procedure successfully completed.SQL> INSERT INTO t1 SELECT rownum, rownum, dbms_random.string('a',50) FROM dual CONNECT BY level <= 10 ORDER BY dbms_random.random;10 rows created.SQL> INSERT INTO t2 SELECT 100+rownum, t1.id, 100+rownum, t1.pad FROM t1, t1 dummy ORDER BY dbms_random.random;100 rows created.SQL> INSERT INTO t3 SELECT 1000+rownum, t2.id, 1000+rownum, t2.pad FROM t2, t1 dummy ORDER BY dbms_random.random;1000 rows created.SQL> INSERT INTO t4 SELECT 10000+rownum, t3.id, 10000+rownum, t3.pad FROM t3, t1 dummy ORDER BY dbms_random.random;10000 rows created.SQL> COMMIT;Commit complete.

Use hint to connect SQL statements through the nested loop, and specify t3 as the driving table
SQL> select /*+ leading(t3) use_nl(t4) */ * from t3, t4  2  where t3.id = t4.t3_id and t3.n = 1100;10 rows selected.SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));PLAN_TABLE_OUTPUT------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------SQL_ID  89hnfwqakjghg, child number 0-------------------------------------select /*+ leading(t3) use_nl(t4) */ * from t3, t4 where t3.id =t4.t3_id and t3.n = 1100Plan hash value: 1907878852-------------------------------------------------------------------------------------| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |-------------------------------------------------------------------------------------|   0 | SELECT STATEMENT   |      |      1 |        |     10 |00:00:00.01 |     121 ||   1 |  NESTED LOOPS      |      |      1 |     10 |     10 |00:00:00.01 |     121 ||*  2 |   TABLE ACCESS FULL| T3   |      1 |      1 |      1 |00:00:00.01 |      16 ||*  3 |   TABLE ACCESS FULL| T4   |      1 |     10 |     10 |00:00:00.01 |     105 |-------------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   2 - filter("T3"."N"=1100)   3 - filter("T3"."ID"="T4"."T3_ID")

In the execution plan, we can see that the driver table T3 is accessed once, because the driver table has the predicate condition t3.n = 1100, and the number of records in the result set is 1 after the predicate condition is executed, so T4 only accesses the starts column once)

Use hint to connect SQL statements through the nested loop and specify t4 as the driver table

SQL> select /*+ leading(t4) use_nl(t3) full(t4) full(t3) */ * from t3, t4  2  where t3.id = t4.t3_id and t3.n = 1100;SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));PLAN_TABLE_OUTPUT--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------SQL_ID  0yxm1muqwrfq2, child number 0-------------------------------------select /*+ leading(t4) use_nl(t3) full(t4) full(t3) */ * from t3, t4where t3.id = t4.t3_id and t3.n = 1100Plan hash value: 3886808168-------------------------------------------------------------------------------------| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |-------------------------------------------------------------------------------------|   0 | SELECT STATEMENT   |      |      1 |        |     10 |00:00:00.25 |     150K||   1 |  NESTED LOOPS      |      |      1 |     10 |     10 |00:00:00.25 |     150K||   2 |   TABLE ACCESS FULL| T4   |      1 |  10000 |  10000 |00:00:00.01 |     105 ||*  3 |   TABLE ACCESS FULL| T3   |  10000 |      1 |     10 |00:00:00.21 |     150K|-------------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   3 - filter(("T3"."N"=1100 AND "T3"."ID"="T4"."T3_ID"))
In the execution plan, we can see that the driver table T4 is accessed once. because the number of records in the T4 result set in the driver table is 10000, T4 accessed 10000 times, buffers and A-time (actual execution time) are relatively high.


Iii. nested loop Optimization

Create an index on the connection column (t3_id column of Table T4) of the drive table of the nested loop
SQL> CREATE INDEX t4_t3_id ON t4(t3_id);Index created.SQL> select /*+ leading(t3) use_nl(t4) */ * from t3, t4  2  where t3.id = t4.t3_id and t3.n = 1100;10 rows selected.SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));PLAN_TABLE_OUTPUT------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------SQL_ID  89hnfwqakjghg, child number 0-------------------------------------select /*+ leading(t3) use_nl(t4) */ * from t3, t4 where t3.id =t4.t3_id and t3.n = 1100Plan hash value: 2039660043------------------------------------------------------------------------------------------------------------| Id  | Operation                    | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |------------------------------------------------------------------------------------------------------------|   0 | SELECT STATEMENT             |          |      1 |        |     10 |00:00:00.01 |      29 |   1 ||   1 |  NESTED LOOPS                |          |      1 |        |     10 |00:00:00.01 |      29 |   1 ||   2 |   NESTED LOOPS               |          |      1 |     10 |     10 |00:00:00.01 |      19 |   1 ||*  3 |    TABLE ACCESS FULL         | T3       |      1 |      1 |      1 |00:00:00.01 |      16 |   0 ||*  4 |    INDEX RANGE SCAN          | T4_T3_ID |      1 |     10 |     10 |00:00:00.01 |       3 |   1 ||   5 |   TABLE ACCESS BY INDEX ROWID| T4       |     10 |     10 |     10 |00:00:00.01 |      10 |   0 |------------------------------------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   3 - filter("T3"."N"=1100)   4 - access("T3"."ID"="T4"."T3_ID")
In the execution plan, we can see that the buffer dropped from 121 to 29 after the index is added to the connection column of the drive table.

Add an index to the predicate condition column (n column in table T3) of the driver table
SQL> create index t3_n on t3(n);Index created.SQL> select /*+ leading(t3) use_nl(t4) */ * from t3, t4  2  where t3.id = t4.t3_id and t3.n = 1100;10 rows selected.SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));PLAN_TABLE_OUTPUT--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------SQL_ID  89hnfwqakjghg, child number 0-------------------------------------select /*+ leading(t3) use_nl(t4) */ * from t3, t4 where t3.id =t4.t3_id and t3.n = 1100Plan hash value: 2304842513-------------------------------------------------------------------------------------------------------------| Id  | Operation                     | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |-------------------------------------------------------------------------------------------------------------|   0 | SELECT STATEMENT              |          |      1 |        |     10 |00:00:00.01 |      17 |   1 ||   1 |  NESTED LOOPS                 |          |      1 |        |     10 |00:00:00.01 |      17 |   1 ||   2 |   NESTED LOOPS                |          |      1 |     10 |     10 |00:00:00.01 |       7 |   1 ||   3 |    TABLE ACCESS BY INDEX ROWID| T3       |      1 |      1 |      1 |00:00:00.01 |       4 |   1 ||*  4 |     INDEX RANGE SCAN          | T3_N     |      1 |      1 |      1 |00:00:00.01 |       3 |   1 ||*  5 |    INDEX RANGE SCAN           | T4_T3_ID |      1 |     10 |     10 |00:00:00.01 |       3 |   0 ||   6 |   TABLE ACCESS BY INDEX ROWID | T4       |     10 |     10 |     10 |00:00:00.01 |      10 |   0 |-------------------------------------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   4 - access("T3"."N"=1100)   5 - access("T3"."ID"="T4"."T3_ID")
In the execution plan, we can see that after the index is added to the predicate condition column of the driver table, the buffer continues to drop from 29 to 17.

Iv. Summary

It can be seen that when SQL optimization is performed, if the table connection mode is nested loop:

First, make sure that a table with a small result set is the driving table, and a table with multiple result sets is the driving table. This does not mean that tables with multiple records cannot be used as driving tables. As long as the result set obtained after filtering by predicate conditions is small, it can also be used as a driving table.

Second,Adding an index to the predicate condition column of the driver table and the connected column of the driver table can significantly improve the execution performance.

Finally, if the columns to be queried are in the index, the execution performance will be further improved when you do not return to the table to query the column information.



Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.