Count of SQL optimizations, table connection order, condition order, in and exist

Source: Internet
Author: User
Tags comparable

First, about Count

Having read some articles about count (*) and Count (column), is the efficiency of Count (column) certainly higher than count (*)?

In fact, the individual think that count (*) and Count (column) is not comparable at all, COUNT (*) is the total number of bars in the table, and Count (column) is the number of non-empty records when the column.

But we can compare them by experiment:

First create the test table:

drop table test Purge;
CREATE TABLE Test as SELECT * from Dba_objects;

Update test set object_id =rownum;
Set timing on
Set Linesize 1000
Set Autotrace on


Perform

Select COUNT (*) from test;
Select COUNT (object_id) from test;

Discover that time-consuming is the same, is their efficiency actually the same?

Let's try creating an index on column object_id

CREATE INDEX idx_object_id on test (object_id);

And then execute

Select COUNT (*) from test;
Select COUNT (object_id) from test;

Is it obvious that count (object_id) is significantly higher than count (*), because the index is used by count (object_id), so efficiency is improved a lot?

Let's modify the column properties of the OBJECT_ID

ALTER TABLE test modify OBJECT_ID NOT null;

And then execute

Select COUNT (*) from test;
Select COUNT (object_id) from test;

found that their speed is actually as fast as the count (*) is also available to the index.

In fact, the premise of efficiency comparison is that two statements should be equivalent, the two formulations are not equivalent, and therefore not comparable.

For the Oracle Optimizer, we can experiment to find that the count of different columns, the time of the statistics is not the same, the approximate trend is that the column is more expensive, the higher the cost of access, the column offsets the performance of the Jedi Access. The cost of Count (*) is independent of the offset. Therefore, on some occasions count (*) is the fastest.


Ii. about in and exist

About in and exist is mostly said in the efficiency than exist high, so there must be in the place to be replaced exist and so on. But is that really the case?

We can do an experiment:

In the 10g;

SELECT * FROM dept where deptno not in (select Deptno from EMP);
SELECT * FROM dept where not exists (select Deptno from emp where Emp.deptno=dept.deptno);

We find that exist is indeed more efficient than in. This statement seems to be set up AH.

But we'll execute the following statement

SELECT * FROM dept where Deptno not in (the Select Deptno from emp where deptno are NOT null) and DEPTNO are NOT null;

You will find that the efficiency of in and exist is the same after adding a non-empty constraint.

View the execution plan for three statements you will find that the in statement and the exist statement without the non-null constraint are all anti, so the efficiency is the same, and the in statement without the non-null constraint is using the filter instead of the anti algorithm, so the efficiency is worse.

So we can conclude that in Oracle 10g, if you can ensure non-null, the in constraint can use the Anti semi-join algorithm, at which time the efficiency and exist are the same.


In 11g:

SELECT * FROM dept where deptno not in (select Deptno from EMP);
SELECT * FROM dept where not exists (select Deptno from emp where Emp.deptno=dept.deptno);

We found that the efficiency of the two statements was the same, as was the view of the execution plan. Oracle has been optimized in 11g, so the efficiency of in and exist is the same.

From this we can conclude that in 11g, the efficiency of using in and exist is the same, because they are taking more efficient anti algorithms.


Third, about the connection order of the Size list

On the internet we can see a lot of such articles, in the multi-table query, with a small table or cross-table to do the base table, put in the back, the big table is placed in the back of the position, because the table is the order of access from right to left.

But is that really the case?

We can do the experiment to verify (11g):

CREATE TABLE Tab_big as SELECT * from Dba_objects where rownum<=30000;
CREATE TABLE Tab_small as SELECT * from Dba_objects where rownum<=10;
Set Autotrace traceonly
Set Linesize 1000
Set timing on
Select COUNT (*) from Tab_big,tab_small;
Select COUNT (*) from Tab_small,tab_big;

As we look at the execution plan, we can see that the efficiency of the two statements is the same, is it not related to the order and efficiency of multiple table queries, tables?

We are executing the following statement:

Select/*+rule*/Count (*) from Tab_big,tab_small;
Select/*+rule*/Count (*) from Tab_small,tab_big;

We can clearly find that the small table in the right, big table in the left statement, query efficiency is much higher.

In fact, in the rule-based era, the query efficiency is related to the connection order of the table, small table or cross-table in the left, large table on the right of the execution efficiency will be higher. But now it's basically a cost-based era, so the order and efficiency of the size tables are irrelevant, and the Oracle optimizer automatically optimizes for efficiency.


Iv. Join condition Order in the WHERE clause

In the rule-based era, Oracle uses a bottom-up order to parse the WHERE clause, and according to this principle, we typically place the table with the lowest possible return rows in the last face, and the clause in the WHERE clause with the filter condition placed on the last side.

But in today's cost-based era, this optimization has been optimized with the help of the Oracle optimizer, so the order and condition of the tables have no effect on our query efficiency.




Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.