Original: http://www.oracle.com/technetwork/issue-archive/2009/09-may/o39asktom-096149.html
how a constraint (constraints) in Oracle affects the query plan
Often people think that constraints are just about data integrity, no problem. But constraints are also used by the optimizer to optimize the execution plan.
The optimizer takes the following resources into the most input inputs:
1) queries to be optimized
2) All database object statistics
3) System statistics, if available (CPU speed, single-block I/o speed, etc. as a measure of physical hardware)
4) Initialization parameters
5) constraints
I often hear people ignoring the use of constraints in a data warehouse/reporting system, on the grounds that the data itself is OK, and that we do data cleansing (they are not happy if the constraint is allowed), but they do need data integrity constraints in order to get a better execution plan. A poor execution plan in the data warehouse can take hours or even days to complete. Below we illustrate the importance of constraints to the execution plan through an example:
1. See how the NOT NULL constraint affects the execution plan:
Code1: Creating data-isolated tables and views
drop table T1;
drop table T2;
Drop View V;
CREATE TABLE T1
As
SELECT * FROM All_objects
where object_type in (' TABLE ', ' VIEW ');
ALTER TABLE T1 modify object_type not null;
ALTER TABLE t1 add constraint t1_check_otype check (object_type in (' Table ', ' VIEW '));
CREATE TABLE T2
As
SELECT * FROM All_objects
where object_type in (' synonym ', ' PROCEDURE ');
ALTER TABLE T2 modify object_type NOT null;
ALTER TABLE t2 add constraint T2_check_otype
Check (object_type in (' synonym ', ' PROCEDURE '));
Create or Replace view V
As
SELECT * FROM t1
UNION ALL
select * from T2;
Code2: Optimize out a table
Set Autotrace traceonly explain
SELECT * from v where object_type = ' TABLE ';
Execution Plan
----------------------------------------------------------------------------
Plan Hash value:3982894595
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 40 | 6320 | 151 (1) | 00:00:02 |
| 1 | VIEW | V | 40 | 6320 | (1) | 00:00:02 |
| 2 | Union-all | | | | | |
|* 3 | TABLE ACCESS Full | T1 | 3083 | 475k| 31 (0) | 00:00:01 |
|* 4 | FILTER | | | | | |
|* 5 | TABLE ACCESS full| T2 | 5 | 790 | 12 (1) | 00:00:02 |
-----------------------------------------------------------------------------
predicate information (identified by Operation ID):
-------------------------------
3-filter ("object_type" = ' TABLE ')
4-filter (null is NOT NULL)
5-filter ("object_type" = ' TABLE ')
The strange thing is: We check the data only in T1 there is no T2! And see Filter4:null is not NULL this condition we do not specify, is the execution plan Plus!
This condition is not always false.
To illustrate how the NOT NULL constraint works, let's look at an example:
CODE3:
drop table t;
CREATE TABLE T
As
SELECT * from All_objects;
CREATE index T_idx on T (object_type);
exec dbms_stats.gather_table_stats (user, ' T ');
Select COUNT (*) from T;
Execution Plan
----------------------------------------
Plan Hash value:2966233522
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU) | Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 283 (1) | 00:00:04 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS full| T | 68437 | 283 (1) | 00:00:04 |
-------------------------------------------------------------------
This plan does not use the index that we created. The reason is that the object_type column is nullable, and the index does not contain all the null values, so it cannot be based on the index key value
Count operation. If we tell the database: object_type is not NULL, the execution plan will change instantly!
CODE4:
ALTER TABLE T modify object_type not NULL;
Select COUNT (*) from T;
Execution Plan
------------------------------------------
Plan Hash value:1058879072
------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU) | Time |
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 54 (2) | 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FAST Full scan| T_idx | 68437 | 54 (2) | 00:00:01 |
------------------------------------------------------------------------
Some kind of magic bar!
The problem is that joining the object_type column can be null, and how do I fix it? The answer is that we can create multi-column indexes (combined indexes) of course object_type than in them.
For example:
CODE5:
Drop index T_IDX;
CREATE index T_idx on t (object_type, 0);
Code6:
SELECT * FROM t where object_type is null;
Execution Plan
-----------------------------
Plan Hash value:470836197
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 101 | 1 (0) | 00:00:01 |
| 1 | TABLE ACCESS by INDEX ROWID | T | 1 | 101 | 1 (0) | 00:00:01 |
|* 2 | INDEX RANGE SCAN | T_idx | 1 | | 1 (0) | 00:00:01 |
--------------------------------------------------------------------------------------
predicate information (identified by Operation ID):
---------------------------------------------------
2-access ("Object_type" is NULL)
2. See how the primary foreign KEY constraint affects the execution plan:
Code7:
drop table emp;
drop TABLE Dept;
Drop View emp_dept;
CREATE TABLE EMP
As
SELECT *
From Scott.emp;
CREATE TABLE Dept
As
SELECT *
From Scott.dept;
Create or Replace view emp_dept
As
Select Emp.ename, Dept.dname
From EMP, dept
where Emp.deptno = Dept.deptno;
--we pretend that EMP and dept are two big tables!
Begin
Dbms_stats.set_table_stats
(User, ' EMP ', numrows=>1000000, numblks=>100000);
Dbms_stats.set_table_stats
(User, ' DEPT ', numrows=>100000, numblks=>10000);
End
/
Code8:
Sql> select ename from Emp_dept;
Execution Plan
-----------------------------
Plan Hash value:615168685
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | TEMPSPC | Cost (%CPU) | Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000k| 31m| | 31515 (1) | 00:06:19 |
|* 1 | HASH JOIN | | 1000k| 31m| 2448k| 31515 (1) | 00:06:19 |
| 2 | TABLE ACCESS full| DEPT | 100k| 1269k| | 2716 (1) | 00:00:33 |
| 3 | TABLE ACCESS full| EMP | 1000k| 19m| | 27151 (1) | 00:05:26 |
----------------------------------------------------------------------------------------
predicate information (identified by Operation ID):
---------------------------------------------------
1-access ("EMP"). DEPTNO "=" DEPT "." DEPTNO ")
Here we only query the ename column of the EMP table, dept performance is not necessary. But Deptno is the primary key of the Dept table, the foreign key of the EMP table! This causes the Deptno column in the EMP table to be non-empty. But we didn't specify the relationship,
Oracle naturally does not know, so we do this:
ALTER TABLE DEPT ADD constraint DEPT_PK primary key (DEPTNO);
ALTER TABLE EMP ADD constraint emp_fk_dept foreign key (DEPTNO) References dept (DEPTNO);
Select ename from Emp_dept;
Execution Plan
------------------------------
Plan Hash value:3956160932
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 50000 | 976k| 27152 (1) | 00:05:26 |
|* 1 | TABLE ACCESS full| EMP | 50000 | 976k| 27152 (1) | 00:05:26 |
--------------------------------------------------------------------------
predicate information (identified by Operation ID):
---------------------------------------------------
1-filter ("EMP"). DEPTNO "is not NULL)
It's working! Added a filter.
3. See how a NOT NULL and primary FOREIGN KEY constraint collocation materialized view queries affect the execution plan
I often use materialized views as a "Data Warehouse index", the most important use being "preanswer", which is to store complex and long-running results for a particular table in a permanent table.
To be blunt is to speed up the query.
ALTER TABLE EMP drop constraint emp_fk_dept;
ALTER TABLE dept drop constraint DEPT_PK;
Code10:
Create materialized view MV Enable query rewrite
As
Select Dept.deptno, Dept.dname, COUNT (*) from EMP, dept
where Emp.deptno = Dept.deptno
Group by Dept.deptno, Dept.dname;
Begin
Dbms_stats.set_table_stats
(User, ' MV ', numrows=>100000, numblks=>10000);
End
/
CODE11: A query uses materialized views
Select Dept.dname, COUNT (*) from EMP, dept
where Emp.deptno = Dept.deptno and dept.dname = ' SALES '
Group BY Dept.dname;
Execution Plan
------------------------------
Plan Hash value:1703036361
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | bytes| Cost (%CPU) | Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 22000 | 2716 (1) | 00:00:33 |
| 1 | SORT GROUP by Nosort | | 1000 | 22000 | 2716 (1) | 00:00:33 |
|* 2 | Mat_view REWRITE ACCESS Full | MV | 1000 | 22000 | 2716 (1) | 00:00:33 |
--------------------------------------------------------------------------------------------
predicate information (identified by Operation ID):
---------------------------------------------------
2-filter ("MV". " Dname "= ' SALES ')
Code12: A query does not use materialized views
Sql> Select COUNT (*) from EMP;
COUNT (*)
--------
14
Sql> select * FROM table (dbms_xplan.display_cursor);
Plan_table_output
-----------------------
sql_id g59vz2u4cu404, child number 1
-----------------------
Select COUNT (*) from EMP
Plan Hash value:2083865914
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU) | Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | 27142 (100) | |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS full| EMP | 1000k| 27142 (1) | 00:05:26 |
-------------------------------------------------------------------
Rows selected.
It is clear that the more execution plan should be to implement the computed data in the query materialized view.
To do this, we need to tell the database the following:
DEPTNO in DEPT is a primary key
DEPTNO in EMP is a foreign key
DEPTNO in EMP was a not NULL column
ALTER TABLE DEPT ADD constraint DEPT_PK primary key (DEPTNO);
ALTER TABLE EMP ADD constraint emp_fk_dept foreign key (DEPTNO) References dept (DEPTNO);
ALTER TABLE EMP Modify DEPTNO not NULL;
CODE13:
Sql> Select COUNT (*) from EMP;
COUNT (*)
-------
14
Sql> select * FROM table (dbms_xplan.display_cursor);
Plan_table_output
-----------------------
sql_id g59vz2u4cu404, child number 2
-----------------------
Select COUNT (*) from EMP
Plan Hash value:1747602359
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2716 (100) | |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
| 2 | Mat_view REWRITE ACCESS full| MV | 100k| 1269k| 2716 (1) | 00:00:33 |
--------------------------------------------------------------------------------------
Query rewriting properties for materialized views Reference: http://blog.itpub.net/28719055/viewspace-1258720/
-------------------------------
Dylan presents.
How a constraint (constraints) in Oracle affects the query plan