1. Functions of statistical information
The CBO-based optimizer in Oracle relies heavily on statistics when generating execution plans. You can understand CBO as a complex mathematical model, statistics are the most important input, and execution plans are outputs. If the input is not accurate, can the output be accurate? All, whether the statistics are timely and effective has a critical impact on the performance of the execution plan.
2. dbms_stats package
Oracle using dbms_stats package analysis statistics (Analyze command is outdated, not recommended), the use of this package, the official documentation has a detailed description (http://docs.oracle.com/cd/E11882_01/appdev.112/e25788/d_stats.htm#i1036461 ), here are some of the most common examples.
1) gather_table_stats
DBMS_STATS.GATHER_TABLE_STATS (
Ownname VARCHAR2,
Tabname VARCHAR2,
Partname VARCHAR2 default null,
Estimate_percent number default null,
Block_sample boolean default false,
Method_opt VARCHAR2 DEFAULT 'FOR ALL COLUMNS SIZE
1 ',
Degree number default null,
Granularity VARCHAR2 DEFAULT 'default ',
Cascade boolean default false,
Stattab VARCHAR2 default null,
Statid VARCHAR2 default null,
Statown VARCHAR2 default null,
No_invalidate boolean default false );
This package is most commonly used. It collects statistics on tables. The syntax is shown above. Its key parameters are as follows:
- Method_opt (histogram option)
First, let's talk about what is a histogram. A histogram is used to collect statistics on the data distribution in a column to let the optimizer know the distribution of data in each column. If the data distribution in a column is very skewed, it is better to collect histogram information for this column.
Method_opt has the following options:
- For all columns-> calculate the histogram of all columns in the table
- For all indexed columns-> count the histogram of index columns defined on the table
- For columns <list> SIZE <N> | REPEAT | AUTO | SKEWONLY:
- N: Number of bins in the histogram. value range: [1, 1,254]. 1 is equivalent to not collecting histograms.
- REPEAT: only the columns with original histogram information are collected;
- AUTO: Oracle determines the size of N.
- SKEWONLY: only collects histograms of unevenly distributed columns. The system automatically determines the number of buckets)
Example 1: collect statistics on table t, but do not collect the histogram: exec dbms_stats.gather_table_stats (user, 't', method_opt => 'for all columns size 1 ');
Example 2: Collect the histogram by the maximum number of buckets: exec dbms_stats.gather_table_stats (user, 't', method_opt => 'for all columns size 254 ');
- Granularity (granularity of statistics collected for partition tables)
This parameter is mainly used for partition tables, which have the following three types of statistics:
- Global-> global statistics
- Partition-> partition statistics
- Sub-partition-> subpartition statistics
- Cascade (whether to collect index statistics at the same time)
True is equivalent to running gather_index_stats on all indexes of the table.