SQL optimization: optimization of the connection sequence, condition sequence, in and exist of count and table, countexist

Source: Internet
Author: User
Tags comparable

SQL optimization: optimization of the connection sequence, condition sequence, in and exist of count and table, countexist

This article describes how to optimize the connection sequence, condition sequence, in and exist of count and table in SQL optimization, which is very useful! Details are as follows:

1. About count

I have read some online articles about count (*) and count (Column). Is the efficiency of count (column) more efficient than count?

In fact, I personally think that count (*) and count (column) are not comparable at all. count (*) counts the total number of rows in the table, while count (column) counts the number of non-empty records in a column.

However, we can compare them through experiments:

First, create a test table:

drop table test purge;create table test as select * from dba_objects;update test set object_id =rownum ;set timing on set linesize 1000set autotrace on 

Run

select count(*) from test;select count(object_id) from test;

The time consumed is the same. Is their efficiency the same?

Let's try to create an index on the column object_id.

create index idx_object_id on test(object_id);

Then execute

select count(*) from test;select count(object_id) from test;

It is found that the speed of count (object_id) is significantly higher than that of count (*). Is it because count (object_id) can use indexes, so the efficiency has improved a lot?

Modify the column attribute of object_id.

alter table test modify object_id not null;

Then execute

select count(*) from test;select count(object_id) from test;

It is found that their speed is as fast as they are, and count (*) can also be used to index.
In fact, the premise of efficiency comparison is that the writing of the two statements should be equivalent. These two statements are not equivalent at all, so they are not comparable.

For the oracle optimizer, we can use experiments to find that the statistical time varies depending on the count column. The general trend is that the closer the column is to the back, the higher the access overhead, the column offset determines the access performance. The overhead of count (*) is irrelevant to the offset. Therefore, in some cases, count (*) is the fastest.

2. About in and exist

Most of the statements about in and exist indicate that in is more efficient than exist. Therefore, exist must be used to replace in. But is that true?

Next we will do an experiment:

In Oracle 10 Gb;

select * from dept where deptno NOT IN ( select deptno from emp ) ;select * from dept where not exists ( select deptno from emp where emp.deptno=dept.deptno) ;

We found that exist is indeed more efficient than in. This statement seems to be true.

But we will execute the following statement again.

select * from dept where deptno NOT IN ( select deptno from emp where deptno is not null) and deptno is not null;

You will find that the efficiency of in and exist is the same with that of non-null constraints.

View the execution plans of the three statements, and you will find that the in statement and the exist statement without any non-null constraints go through the ANTI semi-join algorithm, so the efficiency is the same, the in statement without non-null constraints uses filter instead of ANTI algorithm, so the efficiency is lower.

Therefore, we can draw a conclusion: in oracle 10 Gb, if it can ensure that it is not empty, the in constraint can use the ANTI semi-join algorithm. At this time, the efficiency is the same as that of exist.

In Oracle 11g:

select * from dept where deptno NOT IN ( select deptno from emp ) ;select * from dept where not exists ( select deptno from emp where emp.deptno=dept.deptno) ;

We found that the efficiency of the two statements is the same, and the Execution Plan is also the same. Oracle has been optimized in 11 GB, so the efficiency of in and exist is the same.

From this we can conclude that in 11g, the efficiency of using in and exist is the same, because they all use relatively efficient ANTI algorithms.

Iii. Connection sequence of large and small tables

On the Internet, we can see a lot of such articles. When performing multi-table queries, we can use small tables or cross tables as basic tables, put them behind, and put large tables behind the from, because the table access sequence is from right to left.

But is that true?

We can perform an experiment to verify the environment (Oracle 11g is used here ):

create table tab_big as select * from dba_objects where rownum<=30000;create table tab_small as select * from dba_objects where rownum<=10;set autotrace traceonlyset linesize 1000set timing on select count(*) from tab_big,tab_small ; select count(*) from tab_small,tab_big ;

We can check the execution plan and find that the two statements are the same in efficiency. Is the table order and efficiency irrelevant to multi-table queries?

We are executing the following statement:

select /*+rule*/ count(*) from tab_big,tab_small ; select /*+rule*/ count(*) from tab_small,tab_big ;

We can clearly find that the query efficiency is much higher for small tables on the right and large tables on the left.

In fact, in the rule-based era, query efficiency is related to the order of table connections. Small tables or cross tables are left, while large tables are executed on the right more efficiently. But now it is basically a price-based era, so the order and efficiency of the size table are irrelevant. The oracle optimizer will automatically optimize the efficiency.

4. Order of join conditions in the where clause

In the rule-based era, oracle uses a bottom-up sequence to parse the where clause. Based on this principle, we usually put the table with the least number of rows returned at the end, the where clause contains a filter condition at the end of the clause.

However, in the current price-based era, oracle optimizer has been used to optimize the optimization. Therefore, the order of the table and the order of the condition will not affect our query efficiency.




Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.