Oracle Performance Analysis 12: Object statistics

Source: Internet
Author: User

Object statistics Describe how data is stored in a database, and the query optimizer uses these statistics to make the right decisions. There are three types of object statistics in Oracle: Table statistics, column statistics, and index statistics. In each type, there are fine-grained: Table or index-level statistics, partition-level statistics, and sub-partition-level statistics, and the following two are available only if the object is partitioned and has child partitions.

Statistical information related View chart statistics

Table/index-level statistics

User_tab_statistics
User_tables

Partition-Level statistics

User_tab_statistics
User_tab_partitions

Sub-partition level statistics

User_tab_statistics
User_tab_subpartitions

Column statistical information

Table/index-level statistics

User_tab_col_statistics
User_tab_histograms

Partition-Level statistics

User_part_col_statistics
User_part_histograms

Sub-partition level statistics

User_subpart_col_statistics
User_subpart_histograms

Index statistical information

Table/index-level statistics

User_ind_statistics
User_indexes

Partition-Level statistics

User_ind_statistics
User_ind_partitions

Sub-partition level statistics

User_ind_statistics
User_ind_subpartitions

Create a test table

Here you will create a description of the test table T used for post-facing statistics.

Create a test table
CREATE TABLE test as select RowNum as ID,       round (Dbms_random.normal *) as VAL1,       + round (ln (rownum/3.25 +  2) as Val2,       + round (ln (rownum/3.25 + 2)) as Val3,       dbms_random.string (' P ', +) as pad from  all_objects where RowNum <= the order by Dbms_random.value

The above statement creates a 1000-row table, and then we empty the negative values in the Val1 column:

Update Test Set val1 = null where Val1 < 0;
To add a primary key and index to a test table
ALTER TABLE TEST ADD constraint TEST_PK primary key (ID), CREATE INDEX test_val1 on test (VAL1), create index test_val2 on t EST (val2);
Collect Statistics for test tables
Begin  Dbms_stats.gather_table_stats (ownname          = user,                                tabname          = ' TEST ',                                estimate_ Percent = +,                                method_opt       = ' For all columns size skewonly ',                                cascade          = TRUE); end;
Table statistical Information

The following are the key fields in the table statistics:

Select Num_rows, blocks, Empty_blocks, Avg_space, chain_cnt, Avg_row_len from  user_tab_statistics WHERE table_name = ' TEST '; Num_rowsblocksempty_blocksavg_spacechain_cntavg_row_ LEN----------------------------------------------------------------------------------100039000265

Here is an explanation of the meaning of the field:

1) Num_rows: The number of rows of data in the table;
2) Blocks: The number of data blocks below the high watermark (see "Oracle Performance Analysis 4: Full scan of data access method" http://blog.csdn.net/tomato__/article/details/38981425);
3) Empty_blocks: The number of data blocks above the high watermark, because Dbms_stats does not calculate the value, therefore 0;
4) Avg_space: The average free space (bytes) of the table data block, because Dbms_stats does not calculate the value, so it is 0;
5) CHAIN_CNT: The total number of rows involved in the row link and row migration, because Dbms_stats does not calculate the value, therefore 0 (see "Oracle Row Migration and Row link" http://blog.csdn.net/tomato__/article/details/40146573);
6) Avg_row_len: The average length (in bytes) of each record in the table.

Column statistical information

The following are the most important statistics fields for column statistics:

Select column_name,       num_distinct,       low_value,       high_value,       density,       num_nulls,       Avg_col_ Len,       histogram,       num_buckets from  user_tab_col_statistics where table_name = ' TEST ';

The following is an explanation of these fields:
1) Num_distinct: The number of different values in the column;
2) Low_value: The minimum value of the column. Displays the format for internal storage, storing only the first 32 bytes for string columns;
3) High_value: The maximum value of the column. Displays the format for internal storage, storing only the first 32 bytes for string columns;
4) A decimal between density:0 and 1. Close to 0 means that the filter operation for the column can remove most rows. Close to 1 indicates that the filter operation for the column does not work.
If there is no histogram, the value is computed as: density=1/num_distinct.
If there is a histogram, there are different methods of calculation based on the different histogram types.
5) Num_nulls: The total number of NULL stored in the column;
6) Avg_col_len: Average column size, expressed in bytes;
7) Histogram: Indicates whether there are histogram statistics, values include: none (None), FREQUENCY (frequency type) and height BALANCED (average distribution type);
8) Num_buckets: The number of buckets in the histogram, the minimum is 1, the maximum is 254.
Note: Low_value and High_value are represented as internally stored formats, and the following stored procedure can get the maximum minimum value for all columns of the test table:

DECLARE  l_val1 Test.val1%type;begin  for V in (select Low_value, high_value from              user_tab_col_statistics             WHERE table_name = ' TEST ') loop    dbms_stats.convert_raw_value (V.low_value, l_val1);    Dbms_output.put_line (' low value: ' | | l_val1);    Dbms_stats.convert_raw_value (V.high_value, l_val1);    Dbms_output.put_line (' low value: ' | | l_val1);  End Loop;end;
Histogram

The query optimizer needs to find the number of rows of data that meet the criteria, and if the data for the column is evenly distributed, it is easy to calculate based on the minimum, maximum, and unique values, which are available in the column statistics. However, if the data is not evenly distributed, the query optimizer needs additional information to make the correct estimate.
The additional information required by these query optimizers about uneven distribution of data is called histograms, and there are two types of histograms: the frequency histogram (frequency histogram) and the contour histogram (height-balanced histogram).

Frequency histogram

The intrinsic characteristics of the frequency histogram are as follows:
1) The number of buckets (that is, the number of classes) equals the total number of unique values. For each bucket, the view user_tab_histograms has a row of data corresponding to it;
2) The column Endpoint_value provides the value itself. The column number type, which should be converted to columns of this non-numeric type, takes only the first six bytes. This means that the distribution of the values stored in the histogram is based on the preceding part of the column, so that a fixed-prefix string can cause the histogram to be heavily unbalanced;
3) column Endpoint_number is the number of cumulative occurrences of the value, the current Endpoint_number minus the previous endpoint_number, which is the number of occurrences of the current row value.
The frequency of column val2 can be obtained in the following ways:

Select column_name,       endpoint_value,       endpoint_number,       endpoint_number-lag (endpoint_number, 1, 0) over (  Order by Endpoint_number) as frequency from  user_tab_histograms where table_name = ' TEST ' and   column_name = ' VAL2 ' ORDER BY Endpoint_numbercolumn_nameendpoint_valueendpoint_ Numberfrequency-------------------------------------------------------val210188val21023325val210310168val2104286185val210 5788502val21061000212

The test table below is used as an example of how the optimizer uses the frequency histogram to accurately estimate the number of rows returned by the query:

Explain plan set statement_id ' 101 ' for SELECT * from test where val2 = 101;explain plan set statement_id ' 102 ' for select  * FROM test where val2 = 102;explain plan set statement_id ' 103 ' for SELECT * from test where val2 = 103;explain plan Set statement_id ' 104 ' for SELECT * from test where val2 = 104;explain plan set statement_id ' "" for SELECT * from Test whe Re val2 = 105;explain plan set statement_id ' 106 ' for SELECT * from test where val2 = 106;

Then we look at the estimate of the number of rows returned by the execution plan:

Select Statement_id,cardinality from plan_table where id = 0; Statement_idcardinality----------------------------------10181022510368104185105502106212
Histogram of equal height

When the number of unique values in a column is always greater than the maximum allowable number of buckets (254), the frequency histogram cannot be used, which is only used for the histogram of the same height.
The main features of the high histogram are as follows:
1) Fewer barrels than the total number of unique values. Unless compressed, corresponding to each bucket, the view User_tab_histograms has a line containing the endpoint number (endpoint numbers) corresponding to it, the endpoint number 0 indicates the minimum value;
2) The endpoint value (Endpoint_value) is the value of the column. Because the column is of type number, non-numeric types must be converted, and this value takes only the first six bytes;
3) The Endpoint_number column gives the bucket number;
4) The histogram does not store the frequency of a value.
The equal height histogram stores only the column values that belong to one bucket, and if there are two column values in the same bucket, one of them is ignored (compressed), and such statistics can result in inaccurate estimates. In practice, an equal-height histogram can lead to incorrect estimates and may cause instability in the query optimizer's valuation.

Index statistical information

The following query can get index statistics:

Select Index_name,       blevel,       leaf_blocks,       Distinct_keys,       num_rows,       clustering_factor,       avg _leaf_blocks_per_key,       Avg_data_blocks_per_key from  user_ind_statistics where table_name = ' TEST ';

The main fields have the following meanings:
1) Blevel: The number of branch blocks that need to be read in order to access the leaf block, including the root block;
2) Leaf_blocks: Number of leaf blocks in the index;
3) Distinct_keys: The total number of unique key values in the index;
4) Num_rows: The number of key values in the index;
5) Clustering_factor: See "Oracle Performance Analysis 8: Using the index" http://blog.csdn.net/tomato__/article/details/39294655;
6) Avg_leaf_blocks_per_key: The average number of leaf blocks storing a key value, the formula is as follows;
Avg_leaf_blocks_per_key = Leaf_blocks/distinct_keys
7) Avg_data_blocks_per_key: The average number of data blocks referenced by a single key in the table, with the following formula:
Avg_data_blocks_per_key = Clustering_factor/distinct_keys


Oracle Performance Analysis 12: Object statistics

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.