Comparison of execution efficiency between SQL IN and EXISTS

Source: Internet
Author: User

In SQL, there are three types of IN:

  • IN with a literal list. For example, select * from t1 where f1 in ('A', 'B') should be more efficient than either of the two equivalent forms: select * from t1 where f1 = 'A' or f1 = 'B', or select * from t1 where f1 = 'A' union all select * from t1 where f1 = 'B'. This is probably not the category you are asking about, so it is not discussed here.
  • IN with an uncorrelated subquery. For example, in select * from t1 where f1 in (select f1 from t2 where t2.fx = 'X'), the conditions in the where clause of the subquery do not depend on the outer query. In general, the optimizer automatically converts this into an EXISTS statement, so the efficiency is the same as that of EXISTS.
  • IN with a correlated subquery. For example, in select * from t1 where f1 in (select f1 from t2 where t2.fx = t1.fx), the conditions in the where clause of the subquery depend on the outer query. The efficiency of such queries depends on the indexes on the columns involved in the correlation and on the amount of data. It is generally considered to be less efficient than EXISTS.

Except for the first type, any IN statement can be rewritten as an EXISTS statement. Many programmers use EXISTS instead of IN out of habit, but seldom compare the execution efficiency of the two.
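The three forms above can be checked for equivalent results with a small sketch. It uses Python's built-in sqlite3 module purely as a runnable harness, which is an assumption: the article's own tests ran on SQL Server. The sample rows are hypothetical, while the names t1, t2, f1 and fx follow the text. It compares results only, not speed.

```python
import sqlite3

# Hypothetical sample data; t1, t2, f1 and fx follow the names used in the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (f1 TEXT, fx TEXT);
    CREATE TABLE t2 (f1 TEXT, fx TEXT);
    INSERT INTO t1 VALUES ('A', 'X'), ('B', 'Y'), ('C', 'X');
    INSERT INTO t2 VALUES ('A', 'X'), ('B', 'X'), ('C', 'Z');
""")

def rows(sql):
    """Run a query and return its rows sorted, so results compare by content."""
    return sorted(conn.execute(sql).fetchall())

# Type 1: a literal list, and its OR / UNION ALL equivalents.
in_list = rows("SELECT * FROM t1 WHERE f1 IN ('A', 'B')")
or_form = rows("SELECT * FROM t1 WHERE f1 = 'A' OR f1 = 'B'")
union_form = rows("SELECT * FROM t1 WHERE f1 = 'A' "
                  "UNION ALL SELECT * FROM t1 WHERE f1 = 'B'")

# Type 2: uncorrelated subquery, rewritten as EXISTS.
in_uncorrelated = rows(
    "SELECT * FROM t1 WHERE f1 IN (SELECT f1 FROM t2 WHERE t2.fx = 'X')")
exists_uncorrelated = rows(
    "SELECT * FROM t1 WHERE EXISTS "
    "(SELECT 1 FROM t2 WHERE t2.f1 = t1.f1 AND t2.fx = 'X')")

# Type 3: correlated subquery, rewritten as EXISTS.
in_correlated = rows(
    "SELECT * FROM t1 WHERE f1 IN (SELECT f1 FROM t2 WHERE t2.fx = t1.fx)")
exists_correlated = rows(
    "SELECT * FROM t1 WHERE EXISTS "
    "(SELECT 1 FROM t2 WHERE t2.f1 = t1.f1 AND t2.fx = t1.fx)")

print(in_list, in_uncorrelated, in_correlated)
```

Note that the EXISTS rewrites move the f1 comparison into the subquery as an explicit, table-qualified condition (t2.f1 = t1.f1).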

Consider two tables, A and B:

  • When only the data of one table (say A) is displayed and there is only one join condition, such as id, using IN is faster: select * from A where id in (select id from B)
  • When only the data of one table (say A) is displayed and there is more than one join condition, such as id and col1, IN is inconvenient. Use EXISTS instead: select * from A where exists (select 1 from B where B.id = A.id and B.col1 = A.col1)
  • When the data of both tables is displayed, neither IN nor EXISTS is suitable. Use a join: select * from A left join B on B.id = A.id
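As a minimal sketch of the three cases above, the following runs each query against a few hypothetical rows. Python's sqlite3 module is used only as a harness (an assumption, since the article targets SQL Server), and the note column on B is made up for illustration.

```python
import sqlite3

# Hypothetical rows; A, B, id and col1 follow the text, "note" is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER, col1 TEXT);
    CREATE TABLE B (id INTEGER, col1 TEXT, note TEXT);
    INSERT INTO A VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO B VALUES (1, 'x', 'first'), (2, 'q', 'second');
""")

# One join condition, only A displayed: IN reads naturally.
single = conn.execute(
    "SELECT * FROM A WHERE id IN (SELECT id FROM B) ORDER BY id").fetchall()

# Two join conditions, only A displayed: EXISTS with qualified columns.
multi = conn.execute(
    "SELECT * FROM A WHERE EXISTS "
    "(SELECT 1 FROM B WHERE B.id = A.id AND B.col1 = A.col1) "
    "ORDER BY id").fetchall()

# Columns from both tables displayed: LEFT JOIN keeps unmatched rows of A.
joined = conn.execute(
    "SELECT A.id, A.col1, B.note FROM A LEFT JOIN B ON B.id = A.id "
    "ORDER BY A.id").fetchall()

print(single)
print(multi)
print(joined)
```

Row 2 matches on id but not on col1, so it appears in the IN result and drops out of the two-condition EXISTS result.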

Therefore, the method used depends on the requirements.

This is a general test:

set statistics io on
select * from sysobjects where exists (select 1 from syscolumns where id = syscolumns.id)
select * from sysobjects where id in (select id from syscolumns)
set statistics io off

(47 rows affected)
Table 'syscolpars'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 2, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysschobjs'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row affected)

(44 rows affected)
Table 'syscolpars'. Scan count 47, logical reads 97, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysschobjs'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row affected)

set statistics io on
select * from syscolumns where exists (select 1 from sysobjects where id = syscolumns.id)
select * from syscolumns where id in (select id from sysobjects)
set statistics io off

(419 rows affected)
Table 'syscolpars'. Scan count 1, logical reads 10, physical reads 0, read-ahead reads 15, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysschobjs'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row affected)

(419 rows affected)
Table 'syscolpars'. Scan count 1, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysschobjs'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row affected)

Test results (in general, EXISTS is more efficient than IN):

Efficiency: indexes on the columns used in the condition are critical.

Using syscolumns as the condition (syscolumns holds more rows than sysobjects): IN needs 47 scans and 97 logical reads, while EXISTS needs 1 scan and 3 logical reads.

Using sysobjects as the condition (sysobjects holds fewer rows than syscolumns): the scan counts and logical reads are identical, and the only difference is that EXISTS shows 15 read-ahead reads against 0 for IN.
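One detail worth checking when reading the first test: its EXISTS query returns 47 rows while the IN query returns 44. A likely explanation is name scoping. Inside the subquery, the unqualified id resolves to the innermost table (syscolumns), so the predicate compares syscolumns.id with itself and EXISTS is true for every outer row. If so, the two statements are not equivalent and their I/O figures are not a like-for-like comparison. The sketch below reproduces the effect with hypothetical tables objs and cols standing in for sysobjects and syscolumns. It runs on SQLite through Python's sqlite3 module, which is an assumption, although the scoping rule shown is standard SQL.

```python
import sqlite3

# objs and cols are hypothetical stand-ins for sysobjects and syscolumns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objs (id INTEGER, name TEXT);
    CREATE TABLE cols (id INTEGER, name TEXT);
    INSERT INTO objs VALUES (1, 't1'), (2, 't2'), (3, 'no_columns');
    INSERT INTO cols VALUES (1, 'a'), (1, 'b'), (2, 'c');
""")

def count(sql):
    """Return the single COUNT(*) value produced by the query."""
    return conn.execute(sql).fetchone()[0]

# Unqualified id binds to cols.id (the innermost table), so the predicate
# is cols.id = cols.id and EXISTS is true whenever cols has any row.
unqualified = count(
    "SELECT COUNT(*) FROM objs WHERE EXISTS "
    "(SELECT 1 FROM cols WHERE id = cols.id)")

# Qualifying both sides makes the subquery genuinely correlated.
qualified = count(
    "SELECT COUNT(*) FROM objs WHERE EXISTS "
    "(SELECT 1 FROM cols WHERE cols.id = objs.id)")

# The IN form matches the qualified EXISTS form.
in_form = count("SELECT COUNT(*) FROM objs WHERE id IN (SELECT id FROM cols)")

print(unqualified, qualified, in_form)
```

Qualifying every column in a correlated subquery (for example sysobjects.id = syscolumns.id) avoids this trap and gives a fair comparison with IN.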

If you want to query the row with the maximum sid in each category (sort), then

select * from test a where not exists (select 1 from test where sort = a.sort and sid > a.sid)

compared with

select * from test a where sid in (select max(sid) from test where sort = a.sort)

is more than three times as efficient.
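A minimal sketch confirming that the two formulations return the same rows, using hypothetical data in a table named test as in the text. It runs on SQLite via Python's sqlite3 module (an assumption) and checks results only. The threefold speed difference reported above depends on the engine, the indexes and the data volume, and is not measured here.

```python
import sqlite3

# Hypothetical rows: two categories with two rows each, one with a single row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (sid INTEGER, sort TEXT);
    INSERT INTO test VALUES (1, 'a'), (2, 'a'), (3, 'b'), (4, 'b'), (5, 'c');
""")

# Keep a row only if no row in the same category has a larger sid.
not_exists = conn.execute(
    "SELECT * FROM test a WHERE NOT EXISTS "
    "(SELECT 1 FROM test WHERE sort = a.sort AND sid > a.sid) "
    "ORDER BY sid").fetchall()

# Keep a row only if its sid equals the maximum sid of its category.
in_max = conn.execute(
    "SELECT * FROM test a WHERE sid IN "
    "(SELECT MAX(sid) FROM test WHERE sort = a.sort) "
    "ORDER BY sid").fetchall()

print(not_exists)
print(in_max)
```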

When optimizing SQL, should you use IN or EXISTS? It mainly depends on whether your filtering conditions are in the main query or in the subquery.
