Oracle Learning Performance Optimization (vii) How join IS implemented

Last Update:2015-08-20 Source: Internet

Author: User

Tags dname joins sorts

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

This article discusses the mechanisms behind join technology. We know that common table joins have the following kinds of

Descartes Connection
Internal connection
Left outer connection
Right outer connection
Fully connected

As you can see in these SQL notation, how do these connected data accesses work?

Nested loop

Let's look at the following query

sql> alter session set optimizer_mode=rule; Session altered. sql> select ename,dname from emp,dept where emp.deptno=dept.deptno;14  Rows selected. Execution plan----------------------------------------------------------plan hash value:  3625962092------------------------------------------------| id  | operation      | name    |------------------------------------------------|    0 | SELECT STATEMENT     |        | |    1 |  NESTED LOOPS     |        | |    2 |   NESTED LOOPS     |        | |    3 |    TABLE ACCESS full     | emp     | | *  4 |    index unique scan     |  pk_dept | |    5 |   TABLE ACCESS BY INDEX ROWID| DEPT     |------------------------------------------------predicate information  (identified  by operation id):---------------------------------------------------   4 -  access ("EMP"). DEPTNO "=" DEPT "." DEPTNO ")

According to our previous interpretation of the implementation plan, this query is implemented as follows:

Full table Scan EMP table (non-blocking scan, do not take all the data out before performing the next step).
The data in the EMP is fetched out by index pk_dept, and the result set becomes (ENAME,ROWID) by querying the ROWID in the index.
The result set of 2 is fetched out by ROWID to access the Dept table, and the result set becomes (Ename,dname)
Returns the result set.

This join implementation, which takes the data out of the loop, is called a nested loop.

This plan can be implemented with the following logical pseudo-code

For y in (for x in (SELECT * from emp) loop index Lookup the ROWID for X.deptno output joined record (ename, DEPT.ROWID) end loop) Loop select * FROM dept where rowid=y.rowid Output joined record (ename,dname) end loop

We refer to the EMP table as the driver table (Note that the driver table is independent of the table order of the FROM clause, mainly the execution plan).

This kind of connection method, applies to the driver table to return the data to be few, and the Deptno column has the index on the Driver table dept. If the query returns n rows, then the Dept table will be scanned n times. This connection specializes in quickly extracting the first row from the result set.

Hash Join

Hash join is suitable for processing large result sets, the optimizer chooses two tables or smaller in the source data, using the join key to create a hash table in memory. Then scan the large table and probe the hash table to find the matching records.
A small table is called a driver table, and a large table is called a profiling table

650) this.width=650; "src=" http://s3.51cto.com/wyfs02/M00/71/9B/wKioL1XUrsiwbid2AAHA4Wz1Vxc804.jpg "title=" hash Join.jpg "alt=" Wkiol1xursiwbid2aaha4wz1vxc804.jpg "/>

This is best when hash table can be put all in memory. If the hash table is not placed in memory, the optimizer will partition the hash table and the partition beyond the memory range will be written to the temporary tablespace.

We discuss the implementation of the hash join in two different situations.

Hash table All in memory

Hash table is a hash function used by Oracle to split a small table into buckets based on join key benefit. After the hash table is established, Oracle scans the large table and uses the same hash algorithm to divide the read data into buckets. The join operation between the bucket and the bucket returns the result. Until the big table is finished reading.

2. Hash table cannot be put into memory

Using the algorithm, the small table is partitioned first (partition) so that a partition can be loaded into memory. The large table is also made the same partition. The partition of the table and the partition are taken out of the data in a 1 implementation manner. After you finish a partition, load another partition into memory and continue the same operation until the data is all processed.

sql> insert into big_emp select * from big_emp; sql> insert into big_emp select * from big_emp; #重复执行多次SQL > / 458752 rows created. sql> create table dept_new as select * from dept; Table created. Sql> set autot traceonlysql> select * from big_emp a,dept_new  b where a.deptno=b.deptno;917504 rows selected. Execution plan----------------------------------------------------------plan hash value:  1925493178-------------------------------------------------------------------------------| id   | operation   | name     | rows  | bytes  | Cost  (%CPU) | time     |--------------------------------------- ----------------------------------------|&NBSP;&Nbsp; 0 | select statement   |      |    917k|    54m|  1490 (2) | 00:00:18 | | *  1 |  hash join   |      |    917k|    54m|  1490 (2) | 00:00:18 | |    2 |   TABLE ACCESS FULL| DEPT_NEW |      4 |   120 |     3 (0) | 00:00:01 | |    3 |   TABLE ACCESS FULL| BIG_EMP  |    917k|    28m|  1482 (1) | 00:00:18 |------------------------------ -------------------------------------------------predicate information  (identified by  Operation id):---------------------------------------------------   1 - access ("A". ") DEPTNO "=" B "." DEPTNO ") Note-----   - dynamic sampling used for this statement   (level=2) Statistics----------------------------------------------------------  4   recursive calls  1  db block gets      66338   consistent gets  0  physical reads  0  redo  size   62512398  bytes sent via sql*net to client      673349  bytes received via SQL*Net from  Client      61168  sql*net roundtrips to/from client   0  sorts  (memory)   0  sorts  (disk)       917504  rows processed

Sort Merge Joins

This article is from the "Ding Dong" blog, please be sure to keep this source http://lqding.blog.51cto.com/9123978/1686257

Oracle Learning Performance Optimization (vii) How join IS implemented

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More