When in and exists are executed, in first executes the query in the subquery, and then executes the primary query. The exists query first executes the primary query, that is, the outer table query, and then executes the subquery. Exists and in have the same execution time as the execution time. if exists needs to be slightly better than in and exists, in executes the query in the subquery first, and then executes the primary query. The exists query first executes the primary query, that is, the outer table query, and then executes the subquery.
Exists and in have the same validity period as in execution time. exists is slightly better than in. Generally, exists is used instead of in.
If the subquery results have fewer records, the primary query tables are large and indexed, the in statement should be used. if the outer primary query has fewer records, exists is used when the table in the subquery is large and there is an index. IN does not process NULL.
When not exists and not in are compared, not exists is more efficient.
To describe the test results, I have collected 315392 data records from the emp1 table. Only two conditions of data are deleted in emp2. The test is based on the execution time.
Data records in emp1.
SQL> select count (*) from emp1;
COUNT (*)
----------
315392
Data Record in emp2:
SQL> select count (*) from emp2;
COUNT (*)
----------
2
1. execute the exists query and query the total number of data that exists in emp2 in emp1.
SQL> select count (*) from emp1 where exists (select null from emp2 where emp1.ename = emp2.ename );
COUNT (*)
----------
45056
Ten executions, up to 0.125 S
2. use not exists to query the total number of data that is not in emp2.
SQL> select count (*) from emp1 where not exists (select null from emp2 where emp1.ename = emp2.ename );
COUNT (*)
----------
270336
Ten executions, up to 0.141 S
3. execute the in query and query the total number of data that exists in emp2 in emp1.
SQL> select count (*) from emp1 where ename in (select ename from emp2 );
COUNT (*)
----------
45056
Ten executions, up to 0.141 S
4. use not in to query the total number of data that is not in emp2.
SQL> select count (*) from emp1 where ename not in (select ename from emp2 );
COUNT (*)
----------
270336
Ten executions, up to 0.328 S
5. Use in to query and call the outer and subquery locations. you must query the number of data records that exist in emp2 and exist in emp1.
SQL> select count (*) from emp2 where ename in (select ename from emp1 );
COUNT (*)
----------
2
Ten executions, up to 0.047 S
6. to use exists to query and call the outer and subquery locations, you must query the number of data records that exist in emp2 and exist in emp1.
SQL> select count (*) from emp2 where ename in (select ename from emp1 );
COUNT (*)
----------
2
Ten executions, up to 0.047 S
To sum up, when using in and exists, I personally think that the efficiency is similar. When not in and not exists are compared, not exists is more efficient than not in.
When in is used, the where condition of the subquery is not affected by the outer layer. automatic optimization is converted into an exist statement, which is the same as exist. (Not verified)
For example, if select * from t1 where f1 in (select f1 from t2 where t2.fx = 'x'), the in and exists are considered to be of the same efficiency.
IN is suitable for the case where the external table is large but the internal table is small; EXISTS is suitable for the case where the external table is small but the internal table is large.