Automatic Collection of Oracle 10g statistical information (automatic statistics gathering)

Source: Internet
Author: User
1. What is statistics:

Optimization statistics are collected to describe the database and database objects in more detail. These statistics are used to select the optimal execution plan for SQL statements. The optimized statistics include:

Table statistics (Table statistics): number of rows, number of blocks, average row length;

Column statistics (column statistics): number of distinct values (Newcastle Disease) in column, number of nulls in column, data distribution (histogram );

Index statistics (index statistics): number of leaf blocks, levels, clustering factor;

System statistics (system statistics): I/O performance and utilization, CPU performance and utilization.

The statistical information of tables, columns, and indexes can be collected automatically. The statistical information of the system can only be collected manually in Oracle 10 Gb.

2. How to get statistics:

In orcale 10 Gb, statistics is collected by the gather_stats_job job, only when there is no statistical information on the database object or the statistical information has expired (Oracle 10 Gb determines whether the database object has been modified to more than 10% rows, Which is tracked by modification Monitoring) the job is automatically created by scheduler when the database is created or upgraded. These jobs can be found in view dba_scheduler_jobs.

Example: Select D. Owner, D. job_name, D. program_name, D. schedule_type, D. Comments from dba_scheduler_jobs d

Result:

By default, scheduler runs the gather_stats_job job in the maintenance window (Maintenance window, which is started from AM to AM and Sunday, the job gather_stats_job is to call the internal system process dbms_stats.gather_database_stats_job_proc to complete information statistics. This process can be based on the priority of the database object statistics information requirements (that is, the number of database objects modified) collects statistics in sequence. Whether the gather_stats_job is closed with the closure of the maintenance window is determined by the stop_on_window_close attribute. The default value of stop_on_window_close is true. In this case, the gather_stats_job is closed with the closure of the maintenance window. The collection of statistical information is a resource-intensive task, so you may want to ensure that it does not affect the normal operation of the database.

Statistical information collection can also be done by using the gather auto option in the process dbms_stats.gather_database_stats. Compared with the internal process scheduler in the system, dbms_stats.gather_database_stats does not distinguish the priority of the database object statistical information requirements.

In non-default cases, Oracle10g can control whether to enable automatic statistics collection by setting the initialization parameter statistic_level. The information of the statistic_level parameter is as follows (Table-1 ):

Parameter type String
Syntax STATISTICS_LEVEL = {ALL | TYPICAL | BASIC}
Default Value TYPICAL
Parameter category DynamicALTER SESSION,ALTER SYSTEM

(Table-1)

This parameter is used to control the level of database statistics collection. When typical is the default value, the system automatically collects all the major information about its management to provide optimal performance. This value is suitable for the vast majority of cases. When the value is all, timed OS statistics and plan execution statistics are added to the typical value system. When the value is basic: many information statistics functions related to system features and functions will be disabled (see Appendix 1 for details ). Therefore, Oracle strongly recommends ParametersSTATISTICS_LEVELDo not set the valueBasic. At the same time, when the parameter is modified with the syntax "alter ",
System set statistics_level = 'typical'; "the modified statistics_level takes effect for the entire system.ALTER SESSION,The range of statistics_level is only for thisSESSIONThe status information of all statistics and report functions controlled by the statistic_level parameter can be obtained from view v $ statistics_level. For more information about the meaning of each field of view v $ statistics_level, see the overview page.

Example:

Begin

Execute immediate 'alter session set statistics_level = all ';

End; -- set this session statistics_level to 'all'

Select v. statistics_name, V. session_status, V. system_status, V. activation_level, V. session_settable from V $ statistics_level V;

The result is as follows:

Begin

Execute immediate 'alter session set statistics_level = Basic ';

End; -- set the statistics_level of this session to 'basic'

Select v. statistics_name, V. session_status, V. system_status, V. activation_level, V. session_settable from V $ statistics_level V;

The result is as follows:

3. Save statistics of previous versions

When the optimizer collects statistics, it may appear that the original optimization method has been working well before statistics are collected. However, due to the new statistics, there is a bad plan, this can cause a sudden query error or lower query efficiency. To avoid this situation, the statistics collection job saves the current statistics before collecting new information. If a problem occurs, you can return the original statistical information or check the difference between the two through historical statistics to solve the problem.

For example, if the statistical information collection job on the table EMP is run at on the evening of June 1, May 31, the performance of subsequent queries becomes worse. Oracle saves the original statistical information and can re-obtain it by executing the following command:

Begin

Dbms_stats.restore_table_stats (

'Arup ',

'Revenue ',

'10-DEC-08 10.00.00.000000000 pm-04:00 ');

End;

This command restores the statistical information until, January 1, December 10. The time information is provided in the timestamp data type. The length of time for historical statistics to be restored is determined by the retention parameter. To view the current retention parameters, use the following query:

SQL> select dbms_stats.get_stats_history_retention from dual;

Get_stats_history_retention

---------------------------

31

Indicates that the statistical information equivalent to 31 days can be saved, but cannot be guaranteed. To know the exact time and date covered by the statistical information, you only need to use the following query:

SQL> select dbms_stats.get_stats_history_availability from dual;

Get_stats_history_availability

---------------------------------------------------------------------

10-dec-08 09.21.33.594053000 pm-04:00

This query indicates that the earliest available statistical information date is, January 1, December 10. You can also set the retention time to a different value by executing the built-in function. For example, to set it to 45 days, you can use:

Execute dbms_stats.alter_stats_history_retention (45)

4. View statistics

Statistics of tables and index hit columns are stored in the data dictionary. You can view the corresponding statistics by selecting some fields in the data dictionary view, for details about the data dictionary view, see appendix 2. The meanings of fields in each view can be found on the webpage: preview.

The column statistics are used as a column to view the column statistics.

Column statistics can be stored in the form of columnar statistical charts, which provide accurate description information for column data, especially when the data column is skewed (there are many rows of records for a certain value, and the number of records for a certain value is very small. Oracle contains two columnar statistical charts: height-balanced and frequency histograms. These charts are stored in the * tab_col_statistics (* user or dBA) view ), the value is height balanced, frequency, or none.

(1) height Histogram

In the height histogram, the values of Data columns are divided into groups, and each group contains only a handful of data. For example, if a column C has a value between 1 and 100, the height histogram of the column is as follows:

Each interval contains 10 rows of data in the data column. The column histogram is shown in figure

In this case, the value of most data rows is 5. The most valuable statistical information during query is the values of the two endpoints in each group range.

The syntax for viewing statistics is as follows:

Begin

Dbms_stats.gather_table_stats (ownname => 'Scott ', tabname => 'emp ',

Method_opt => 'for columns size 6 Sal ');

End;

/

Select column_name, num_distinct, num_buckets, Histogram

From user_tab_col_statistics

Where table_name = 'emp' and column_name = 'sal ';

Result:

Select endpoint_number, endpoint_value

From user_histograms

Where table_name = 'emp' and column_name = 'sal'

Order by endpoint_number;

Result:

, Each row corresponds to each interval in the height histogram.

(2) frequency Histogram

In the frequency histogram, each unique data in the column is equivalent to each interval in the height histogram, and the number of times the data should appear in the column. When the number of distinct entries in a column is smaller than or equal to the number of shard segments of the histogram (that is, the value of num_buckets), the frequency histogram is automatically created. The syntax for viewing the frequency histogram is as follows:

Begin

Dbms_stats.gather_table_stats (ownname => 'Scott ', tabname => 'emp ',

Method_opt => 'for columns size 12 Sal ');

End;

Select column_name, num_distinct, num_buckets, Histogram

From user_tab_col_statistics

Where table_name = 'emp' and column_name = 'sal ';

Result:

Select endpoint_number, endpoint_value

From user_histograms

Where table_name = 'emp' and column_name = 'sal'

Order by endpoint_number;

Result:

5. Notes for using the statistics collection function

Note 1: As mentioned above, the modification monitoring function is started with statistics_level set to 'typical' or 'all, the automatic statistics collection function determines whether to collect statistics for the table based on the monitoring information. If the modification monitoring function is disabled, if the statistical information is automatically collected, you cannot confirm whether the statistical information of the table has expired. Therefore, you need to manually collect the statistical information.

In addition, the user_tab_modifications table records information about the changes to data in all monitored tables. The update of this information will be slightly slower than the actual modification. You can use the dbms_stats.flush_database_monitoring_info stored procedure to immediately update the changed information to the user_tab_modifications table. For the record whose rollback operation is performed after the update, it is still regarded as an affected record. Oracle will not update the user_tab_modifications table after the rollback operation. Therefore, this should be noted.

For example ):

SQL> select * From user_tab_modifications where table_name = 'emp ';

-- No rows selected

SQL> select count (*) from EMP;

Count (*)

----------

14

SQL> Update EMP set sal = Sal + 100;

14 rows updated.

SQL> select * From user_tab_modifications where table_name = 'emp ';

-- No rows selected

SQL> exec dbms_stats.flush_database_monitoring_info ();

PL/SQL procedure successfully completed.

SQL> select inserts, updates, deletes from user_tab_modifications where table_name = 'emp ';

Inserts updates deletes

------------------------------

0 14 0

SQL> rollback;

Rollback complete.

SQL> exec dbms_stats.flush_database_monitoring_info ();

PL/SQL procedure successfully completed.

SQL> select inserts, updates, deletes from user_tab_modifications where table_name = 'emp ';

Inserts updates deletes

------------------------------

0 14 0

NOTE 2: Oracle strongly recommends ParametersSTATISTICS_LEVELDo not set the valueSo when you need to disable the automatic statistics function, you 'd better use the following methods:

Method 1: sysdba Logon

Exec dbms_scheduler.disable ('sys. gather_stats_job ');

Exec dbms_scheduler.enable ('sys. gather_stats_job ');

Method 2: Log On As sysdba

Alter system set "_ optimizer_autostats_job" = false scope = spfile;

Alter system set "_ optimizer_autostats_job" = true scope = spfile;

Pfile can directly modify the initialization parameter file

Then restart the database.

NOTE 3:

Not all database objects are suitable for automatic collection of statistical information. For example, the automatic collection of statistical information cannot meet the requirements in the following situations:

(1) Some tables are deleted (delete) or truncated (truncate) during working hours and re-created;

(2) Some heavy duty tables are modified in large quantities (insert and update) during working hours, and the modification volume exceeds 10%.

For tables with high modification frequency, you can use the following two methods to keep their statistical information out of date:

The following method can be used for (1:

Method 1: Set the statistical information of these tables to null, so that Oracle dynamically collects the statistical information of these non-statistical information tables based on a part of the query optimization. The dynamic statistics collection function of Oracle is controlled by the optimizer_dynamic_sampling parameter. The dynamic statistics collection function of Oracle can be started only when the value of this parameter is greater than or equal to 2 (default value: 2.

To set the statistical information of a database object to null, you only need to delete the original statistical information and lock the statistical information function. The syntax is as follows:

Begin

Dbms_stats.delete_table_stats ('oe ', 'Orders'); -- delete statistics

Dbms_stats.lock_table_stats ('oe ', 'Orders'); -- no statistical information is collected.

End;

Similar locks and locks include lock_schema_stats, lock_table_stats, unlock_schema_stats, and unlock_table_stats.

Method 2: fix the typical statistical information of the database table that best represents the table status as the optimized statistical information of the table. We can collect and lock the typical information at any time to achieve our goal, such statistical information may better express the status of a data table than the statistical information collected at night by the automatic statistical information collection function.

For question (2), the following methods can be used:

For tables with a large modification volume, it is best to collect statistics immediately after the modification operation, which is manually completed as part of the SQL program or database job;

Note 4:

Manual collection of statistical information is required in the following cases.

(1) For external tables, the statistical information is not collected in the job gather_schema_stats, gather_database_stats, and the job that automatically collects statistical information, because data operations are prohibited on External tables, therefore, statistical information of External tables must be collected manually when the corresponding data changes.

(2) When the automatic collection function of statistical information is stopped.

(3) there is also system statistics.

(4) fixed objects, such as dynamic performance tables, need to be collected. When the database has obvious operations, the gather_fixed_objects_stats process is used.

6. The automatic collection of statistical information function Oracle 11i is different from 10 GB

(1)In Oracle 10 Gb, there may be some situations. You need to use your own scripts to collect statistics on some special objects. However, since you use automatic collection of statistics, Oracle will use the same options for all objects to collect statistics, so that you lose control of an object. When you find that the default statistical information collection method is not very timely for an object, you must lock the statistical information of the object and use a special option value to collect statistical information for the object.

For example, the data skew of columns in a table is very serious. If the standard sampling rate is adopted: estimate_perccent = auto_sample_size, it may not be suitable. In this case, you need to specify the sampling rate of the object separately. We know that, in terms of collecting statistics before 11 GB, other similar options provided by Oracle include cascade, degree, method_opt, no_invalidate, and granularity. At 11g, it provides more flexibility, so that you can easily handle the situation mentioned above. In 11g, the options mentioned above can be set at different levels, from high to low: global level, database level, schema level, table level. The low-level options overwrite the high-level options.

For example, for the example above, if you want to collect statistics for a table with a special column value skew that is very serious, you only need to call the following stored procedure to set estimate_perccent = 100 at the table level, as shown below:

SQL> exec dbms_stats.set_table_prefs ('schema _ name', 'table _ name', 'estimate _ perccent ', '123 ');

After this setting, when the database automatically collects statistics, the sampling rate will use auto_sample_size for other tables that do not set the sampling rate separately. For the table_name table that you set separately, the sampling rate of 100 is used to collect statistics. Similarly, if you need to set global options, call dbms_stats.set_global_prefs; if you want to set options at the database level, call dbms_stats.set_database_prefs; if you want to set options at the schema level, call dbms_stats.set_schema_prefs.

(2)In 11g, in addition to the preceding options, three new options are added: publish, incremental, and stale_percent. Where:

1) publish: whether to publish the statistics to the data dictionary immediately after collecting the statistics, or store them in the private area. True indicates immediate release, and false indicates storage in the private zone.

2) stale_percent: determines the maximum number of out-of-date statistical information of an object. If out-of-date statistics are collected, the default value is 10. When calculating whether the statistical information of a table is outdated, Oracle calculates the percentage of the number of modified data rows in the table since the previous statistical information of the table was collected. Then compare the obtained percentage value with the value configured in this option (if the default value is 10). If the value is greater than 10, the statistical information of the table is outdated, the statistical information needs to be collected again; otherwise, the statistical information of the table is considered to be out of date and no need to be collected again.

3) incremental: When collecting global statistics on a partitioned table (set granularity to global), incremental processing is performed. This option is used because for some partition tables, for example, partition tables with a range partition by month, in addition to representing frequent changes in the data in the partition of the current month, data in other partitions will not change. Therefore, when collecting the global statistics on the partition table, it is unnecessary to scan the partitions that are not in the current month again. If you set incremental to true, the data in the partitions not in the current month will not be scanned when collecting statistics, instead, only the data in the partition of the current month will be scanned. Finally, the global statistics of the partition table is obtained by combining the statistical information of the partitions that are not in the current month with the newly calculated statistical information of the current month. You can view the values of each option when collecting statistics from the View: dba_tab_stat_prefs.

(3)There is no known method for modifying the 10% threshold value in Oracle10g version (including the latest 10.2.0.4. However, functions such as set_table_prefs are provided in oracle11g.

The following command changes the default stale value of the specified table from 10% to 5%. This value can be obtained from the new dba_tab_stat_prefs data dictionary.

-- Only for oracle11g

Begin

Dbms_stats.set_table_prefs (ownname => 'Scott ', tabname => 'emp', pname => 'sal', pvalue => '5 ');

End;

/

SQL> select * From dba_tab_stat_prefs;

Owner table_name preference_name prefe

---------------------------------------------

Scott EMP Sal 5

7. Issues to be resolved

This article provides a detailed explanation of the automatic collection of Oracle 10 Gb statistical information, with some practical practices. However, because the volume of data in the local database is very limited, there is still a lack of experiments on how the automatic collection of statistical information can improve the efficiency of the database system. We will note this point in later studies and development, the impact of the automatic statistics collection function on the system efficiency may be explained based on the actual application.

Appendix 1:
1 Automatic workload repository (AWR) snapshots
2 Automatic database diagnostic monitor (ADDM)
3 All server-generated alerts
4 Automatic SGA Memory Management
5 Automatic optimizer statistics collection
6 Object level statistics
7 End to end application tracing (V $ client_stats)
8 Database Time Distribution Statistics (V $ sess_time_model and V $ sys_time_model)
9 Service level statistics
10 Buffer cache Advisory
11 MTTR Advisory
12 Shared Pool sizing Advisory
13 Segment level statistics
14 PGA target Advisory
15 Timed statistics
16

Monitoring of statistics

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.