SQL exists, in, sqlexistsin

Source: Internet
Author: User

SQL exists, in, sqlexistsin
Tips:
1. When the External table is a large table and the inner table is a small table, use exist
2. When the External table is a small table and the internal table is a large table, the in


Example: The External table is large and the internal table is small.
Create table outTable (id1 int );
Insert into outtable select generate_series );


Create table inTable (id1 int );
Insert into inTable values (1), (10000), (100000), (1000000 );


Test = # explain select count (*) from outtable big where exists (select id1 from inTable small where small. id1 = big. id1 );
QUERY PLAN
Bytes -----------------------------------------------------------------------------------------
Aggregate (cost = 18968. 50 .. 18968.51 rows = 1 width = 0)
-> Hash Join (cost = 44. 50 .. 17718.50 rows = 500000 width = 0)
Hash Cond: (big. id1 = small. id1)
-> Seq Scan on outtable big (cost = 0. 00 .. 13922.00 rows = 1000000 width = 4)
-> Hash (cost = 42. 00 .. 42.00 rows = 200 width = 4)
-> HashAggregate (cost = 40. 00 .. 42.00 rows = 200 width = 4)
-> Seq Scan on intable small (cost = 0. 00 .. 34.00 rows = 2400 width = 4)
(7 rows)


Time: 3.743 MS
Test = # explain select count (*) from outtable big where id1 in (select id1 from inTable small where small. id1 = big. id1 );
QUERY PLAN
------------------------------------------------------------------------------
Aggregate (cost = 20032672. 00 .. 20032672.01 rows = 1 width = 0)
-> Seq Scan on outtable big (cost = 0. 00 .. 20031422.00 rows = 500000 width = 0)
Filter: (SubPlan 1)
SubPlan 1
-> Seq Scan on intable small (cost = 0. 00 .. 40.00 rows = 12 width = 4)
Filter: (id1 = big. id1)
(6 rows)


Time: 1.286 MS
Test = # select count (*) from outtable big where exists (select id1 from inTable small where small. id1 = big. id1 );
Count
-------
4
(1 row)


Time: 272.027 MS
Test = # select count (*) from outtable big where id1 in (select id1 from inTable small where small. id1 = big. id1 );
Count
-------
4
(1 row)


Time: 4021.244 MS




Small appearance, large internal table
Test = # select count (*) from intable small where exists (select id1 from outtable big where big. id1 = small. id1 );
Count
-------
4
(1 row)


Time: 4792.643 MS
Test = # select count (*) from intable small where id1 in (select id1 from outtable big where big. id1 = small. id1 );
Count
-------
4
(1 row)


Time: 223.778 MS
Test = # explain select count (*) from intable small where exists (select id1 from outtable big where big. id1 = small. id1 );
QUERY PLAN
Bytes ----------------------------------------------------------------------------------------
Aggregate (cost = 33336. 10 .. 33336.11 rows = 1 width = 0)
-> Hash Semi Join (cost = 29840. 00 .. 33333.10 rows = 1200 width = 0)
Hash Cond: (small. id1 = big. id1)
-> Seq Scan on intable small (cost = 0. 00 .. 34.00 rows = 2400 width = 4)
-> Hash (cost = 13922. 00 .. 13922.00 rows = 1000000 width = 4)
-> Seq Scan on outtable big (cost = 0. 00 .. 13922.00 rows = 1000000 width = 4)
(6 rows)


Time: 1.021 MS
Test = # explain select count (*) from intable small where id1 in (select id1 from outtable big where big. id1 = small. id1 );
QUERY PLAN
------------------------------------------------------------------------------
Aggregate (cost = 19706446. 00 .. 19706446.01 rows = 1 width = 0)
-> Seq Scan on intable small (cost = 0. 00 .. 19706443.00 rows = 1200 width = 0)
Filter: (SubPlan 1)
SubPlan 1
-> Seq Scan on outtable big (cost = 0. 00 .. 16422.00 rows = 1 width = 4)
Filter: (id1 = small. id1)
(6 rows)


Time: 3.578 MS




1. Differences between exists and in
We can see that in is a set operator, a in {a1, a2, a3}
In is an element, followed by a set, that is, to determine whether an element is in the set.
We can see from the above that id1 in (select id1 from outtable big where big. id1 = small. id1); -- If id1 is in the set after in, the row is calculated in count (*); otherwise, it is not counted in count.


Exists is an existence judgment. If there is a result after exists, it is true; otherwise, it is false.
See the following example:
Test = # select count (*) from outtable big where exists (select id1 from outtable where big. id1> 1 );
Count
--------
999999
(1 row)


Time: 488.649 MS
Test = # select count (*) from outtable big where exists (select id1 from outtable where id1> 1 );
Count
---------
1000000
(1 row)


Time: 313.216 MS


Why are the results different?
(Select id1 from outtable where id1> 1) We know that there is a returned result, and there is no impact on the count (*) before exists, that is, any row of the table, therefore, it is equivalent to elect count (*) from outtable big; that is, there is no restriction.
(Select id1 from outtable where big. id1> 1) We know that there is a returned result, and exists has restrictions on the previous statement, that is, big. id1> 1 is equivalent to select count (*) from outtable big where big. id1> 1;




Select * from Table A where exists (select * from Table B where Table B. id = Table A. id)
This sentence is equivalent
Select * from Table A where id in (select id from Table B)
For each piece of data in Table A, select * from Table B where Table B is executed. id = Table. id. If table B has the same id as the current row of table A, exists is true, and this row is displayed; otherwise, the row is not displayed.


Tips:
1. IN general, not exists is faster than not in, because not exists can use a combination of algorithms, but not in does NOT work, and EXISTS is less fast than IN, at this time, IN may use more combined algorithms.
2. Usually, exists does not take the index, while in does the index.


Reference
1. http://www.cnblogs.com/a-zx/articles/1749957.html
What are the differences between SQL statements IN and EXISTS?

IN fact, it is similar to equals. For example, in () is a simple method of writing = 1 or = 2, so IN is generally used when there are few elements. If there are many elements, exists is used.

The usage of exists is different from that of in. Generally, it needs to be associated with the sub-table. in addition, you need to use an index to accelerate the association.

Your SQL statement can be written as NOT EXISTS
Select MC001 from bommc where not exists (SELECT MD001 from bommd where BOMMC. MC001 = BOMMD. MD001)

In SQL is different from exists.

Assume the following applications:
Two tables, user table TDefUser (userid, address, phone) and consumption table TAccConsume (userid, time, amount), need to query the user records that consume more than 5000.
Use exists:
Select * from TDefUser
Where exists (select 1 from TAccConsume where TDefUser. userid = TAccConsume. userid and TAccConsume. amount> 5000)
In:
Select * from TDefUser
Where userid in (select userid from TAccConsume where TAccConsume. amount> 5000)

Generally, exists is more efficient than in.

The subquery after exists () is called a subquery and does not return the value of the list. only returns a true or false result (this is also why the subquery is "select 1". If it is changed to "select 6", the select field can also be returned, but it is obviously less efficient)
The running mode is to run the primary query first and then query its corresponding results in the subquery. If it is ture, the output is not displayed. then, query in the subquery based on each row in the primary query.

The subquery after in () returns the result set. in other words, the execution order is different from that of exists. the subquery first generates a result set, and then the primary query goes to the result set to find the list of fields that meet the requirements. output that meets the requirements, otherwise no output.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.