Oracle Multi-Table connection method hash join Nested Loop join Merge Join

Source: Internet
Author: User
Tags joins

In the ViewSQLWhen we execute the plan, we find that there are many ways to connect tables, and this article introduces how tables are connected in order to better understand the execution plan and understand the principles of SQL execution.
First, the connection method:
Nested Loops (Nested Loops (NL))
(hash) Hash connection (hash join (HJ))
(merge) sort merge joins (sort merge Join (SMJ))

Second, connection description:
1.Oracle can only connect two tables at a time. Regardless of the number of tables in the query,OracleOnly two tables can be manipulated at a time in a connection.
2. When performing a connection to multiple tables, the optimizer starts with a table, joins it to another table, then joins the intermediate result to the next table, and so on until all tables are processed.

Third, the table connection detailed:
1. Nested Loops (Nested Loops (NL)):
Nested loop implementation mechanism (pseudo code):
For R1 in (select rows from Table_1 where Colx={value})
Loop
For R2 in (select rows from table_2, that match, current row from Table_1)
Loop
Output values from the current row of table_1 and current row of table_2;
End Loop;
End Loop;
This code is composed of two loops.
These two tables in a nested loop are often referred to asExternal Tables (Outer table) and internal tables (inner table).
In a nested loop join, the outer table is also known asDriver Table (driver table)
pseudo code: Table_1 as Driver table, table_2 as inner table
It can be seen from pseudocode that the connection process is a 2-layer nested loop, so the less the number of outer loops the better, which is why we use a small table or a table that returns a smaller result set as the driving table.
NEST LOOP JOIN cost = costs to get data from the first table + the cardinality of the results from the first table Х the cost of accessing the second table once
Therefore, nested loops are generally suitable for fewer driver table recordsets (<10000) and the inner table has an efficient index access method.
Use USE_NL (table_1 table_2) to force the CBO to perform nested loop joins.
Driver Table Determination: The driver table "select rows from Table_1 where Colx={value}" is generally a table with a smaller result set based on the where condition, not necessarily a smaller table for the entire table record.
         
  2. (hash) hash connection (hash join (HJ)):
A Hash join is typically used for a small table and a large table for a join. In most cases, the hash join efficiency is more efficient than the other join methods.
For a detailed understanding of the hash join, you can see a more thorough article written on the Web:HTTP://WWW.HELLODBA.COM/READER.PHP?ID=144&LANG=CN

3. Sort merge joins (sort merge Join (SMJ)):

Generally, the effect of a hash join is better than a sort merge connection, but if the row source has already been sequenced and does not need to be sorted when the sort merge connection is performed, the performance of the sort merge connection is better than the hash join. You can use Use_merge (table_1 table_2) to force the use of sort merge connections.
Procedure: Sort two tables, and then merge the two tables after sorting.

Iv. Summary of connection methods:
1)) nested loop (Nest Loop):
Nested loops are a good choice for cases where the subset of data being connected is small. In a nested loop, the outer surface drives the inner table, and each row returned by the surface is retrieved in the inner table to find the row it matches, so the result set returned by the entire query cannot be too large (more than 10000 inappropriate), the table with the smaller returned subset as the Appearance (driver table), and must have an index on the join field of the inner
2) Hash connection (hash join):
Hash joins are a common way of connecting large datasets, using smaller tables in two tables, using connection keys to create a hash table in memory, and then scanning large tables and detecting hashes to find rows that match the hash list.
This applies to a situation where a smaller table can be completely put into memory, so the cost is the sum of the cost of accessing the two tables. However, when the table is very large, it cannot be completely put into memory, when the optimizer divides it into several different partitions, which cannot be put into the memory section to write the partition to a temporary segment of the disk.
Hash connections can only be applied toequivalent Connection(such as where a.col3 = B.col4), non-equivalent connections (where A.col3 > B.col4), outer joins (where A.col3 = B.col4 (+)).
3) Sort Merge connection (sort merge Join)
Generally, hash joins are better than sort merge connections. However, if the row source has already been sequenced and does not need to be reordered when performing a sort merge join, the performance of the sort merge connection is worrying about the hash connection.

Five, the connection mode application scenario:
1. Hash connection only applies to equivalent connections.
2. Nested loops are row source connections and are only suitable for small amounts of data connections.
Hash joins and Sort merge connections are collection connections and are suitable for a large number of data connections.
3. In the equivalent connection mode, a small number of records (&LT;10000) are returned and the internal table has an index on the connection column, which is suitable for nested loops. A hash connection is appropriate if a large number of records are returned.
4. In the equivalent connection mode, two row source collections are large, and if the connection column is a high cardinality column, it is appropriate to hash the connection, otherwise it is appropriate to sort the merge connection.
5. A nested loop connection can return rows that have already been connected without having to wait for all of the connection operations to finish processing before returning data. The other two ways to connect are not.
6. The two datasets of a sort merge connection can be processed in parallel, while nested loops and hash connections cannot.

Excerpt from: http://www.itpub.net/thread-1611025-1-1.html

Oracle Multi-Table connection method hash join Nested Loop join Merge Join

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.