The difference between plsql exist and in

Source: Internet
Author: User

<! [endif]--> <! [endif]-->

I find it necessary to study the company colleagues who like to use exists and in to do sub-query association.

The difference between the two, for reference and memo

/* (This information comes from network begin) for in, it is equivalent to executing a subquery with distinct on inner table, and then connecting the resulting set of results with outer table. Connection methods and indexes use the same connection as the normal two tables (this information comes from the network end) */

For this description of the network, I gave the test, the test table for Liomuser.staff, and liomuser.department, both tables are small tables, the number is about 10,000.

--  For example:

SELECT *

From Liomuser.staff

where department_id in (select department_id from liomuser.department);

--  can be converted to

Select A.*

From Liomuser.staff A,

(select distinct department_id from liomuser.department) b

where a.department_id = b.department_id;

The implementation plans are as follows:

(1) SELECT *

From Liomuser.staff

where department_id in (select department_id from liomuser.department);

(2) Select A.*

From Liomuser.staff A,

(select distinct department_id from liomuser.department) b

where a.department_id = b.department_id;

I choose two small tables, from the data on the way to adopt an external connection in addition to a consistent reading to a little bit smaller, both execution plan and statistical information almost the same.

The test results show that the descriptions given on the small table network are correct

However, in my experience, in the way should be much worse than the external connection performance, according to the above test, the two seem to be the same execution path, is not the table data should be less than the reason?

I decided to use two large tables to do the test, Cust_order and Order_Detail both tables have data volume above 10 million.

First Test in, with the following statement:

Select A.*

From Liomuser.cust_order A

where order_id in (select order_id from Liomuser.order_detail b);

The implementation plan is as follows:

To test the 2 outer JOIN, the statement is as follows:

Select A.*

From Liomuser.cust_order A,

(select distinct order_id from Liomuser.order_detail) b

where a.order_id =. order_id;

The implementation plan is as follows:

Comparing the in and outside connections of the two large tables, it can be seen that the execution plan using the outer connection is obviously better than in the way, using in mode the table connection using nested loop method, the outer connection uses the hash join,

And in the way the cpucost is larger than the outer connection 1/3, so it seems, for the small table, or inner table is a small table query, in and outside the connection is similar, but for large tables, especially the inner table data volume is very large, the use of external connections is much better.

In this sense, in is not exactly equivalent to distinct outside of the inner table, but the outer joins are much more efficient than in.

Below is a discussion of EXIST

In fact exists is equivalent to scanning the outer table first, retrieving each row and inner table to do a loop match, the execution plan is as follows:

Note: Some online data suggest that exists to outer table for a full table scan, but no full table scan was found in the execution plan, and the index was still gone.

The Exists can be translated into:

DECLARE CNT number (10);

For cur in (select a.* from Liomuser.cust_order a) loop

cnt:= 0;

Select COUNT (1) into CNT from Liomuser.order_detail where order_id=cur.order_id;

If cnt<> 0 Then

return cur;

End If;

End Loop;

Exists vs. in:

Statement 1, in

Statement 2, Exsits

There is no difference from the execution plan, so let's look at the statistics for execution:

Statement 1, in

Select A.*

From Liomuser.cust_order A

where order_id in (select order_id from Liomuser.order_detail b)

Statement 2, exists

Select A.*

From Liomuser.cust_order A

where exists

(select 1 from liomuser.order_detail b where a.order_id = b.order_id)

From the two ways statistics can be seen, the use of exists consistency reading is better than in, but the bytessent is higher than in, this also from the side to verify that the previous exists equivalent loop

It is now easy to understand that when the inner table data is large and the index is not good (a large number of duplicates, etc.) it is not advisable to use a large in operation resulting from the inner table retrieval, suggesting that the query innertable too big, take exsits, or external connection mode

In addition: The NOT IN clause performs an internal sort and merge. In either case, not in is

Minimum effect (it performs a full table traversal of the table in the subquery). To avoid using not, we can change it to an outer join (Outer Joins) or not EXISTS

The difference between plsql exist and in

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.