Parallel and concurrent collection of statistical information

Source: Internet
Author: User


Overview of parallel and concurrent collection statistics


In order to increase the efficiency of statistical information collection, Oracle has introduced a method of parallel and concurrent collection of statistics as application data increases and the amount of the table is increased.



This article introduces the relevant knowledge content of parallel and concurrent collection statistics and some cases, and focuses on the collection of concurrency statistics information.


Parallel collection of statistics (PARALLEL)


When the size of a table is particularly large, the efficiency of statistical information collection can be accelerated by working together in parallel slave processes.
By default, the database is able to collect parallel statistics based on the degree of parallelism set at the table or index level (default: 1).
But we can also control the parallelism of the parallel statistics collection by setting the degree parameter in the display.


Degree parameters:


The degree parameter is used to control the degree of parallelism of statistical information collection.
You can assign values in the following ways:


1.通过DBMS_STATS.SET_*_PREFS包设置全局变量2.通过DBMS_STATS.GATHER_*_STATS包设置某次执行变量


Cases:




Oracle recommends specifying the degree parameter as dbms_stats. Auto_degree, Oracle determines the degree of parallelism of statistical information collection based on the size of the object and the setting of the parallel parameters.
Cases:


EXEC DBMS_STATS.GATHER_SCHEMA_STATS(ownname =>‘SH‘, DEGREE => DBMS_STATS.AUTO_DEGREE);


Note
Oracle cannot collect certain types of indexes in parallel, such as: Cluster indexes, domain indexes, and bitmap join indexes.


Concurrency collection Statistics (CONCURRENT)


The parallel collection of statistics simply means that the statistical collection of an object is handled with multiple parallel slave, but the processing is sequential for multiple objects (tables, indexes, or partitions).
That is, processing an object before processing the next object.



Starting with 11.2.0.2, Oracle has launched the concurrent collection of statistics (CONCURRENT) mode to make multi-process environments more efficient in order to enable the collection of statistical information for multiple objects at the same time.
That is, multiple jobs are started concurrently, and multiple objects (tables, indexes, or partitions) are processed concurrently.



Starting with 12.1.0.1, Oracle further expands the concurrency collection statistics so that the Oracle Automated Statistics Collection task (automatic statistics gather task) can also benefit from the concurrent collection of statistics.



The concurrent collection of statistics mainly uses the following techniques:


Oracle Scheduler:用于启动多个JobAdvanced Queuing (AQ):用于控制处理的排序Resource Manager :管理使用资源


To enable concurrency collection statistics, you need to set the following parameters:


CONCURRENT:启用并发收集统计信息功能JOB_QUEUE_PROCESSES:最大JOB数RESOURCE_MANAGER_PLAN:启用Resource Manager有效


When statistics are collected concurrently, the number of jobs generated by the database is allocated according to the situation, in most cases the Dbms_stats program assigns a job to each object, but if the size of the object (table or partition) is too small, to conserve resources, Oracle merges multiple tables and partitions to execute in a single job.



Note
To prevent deadlocks when processing partitions on multiple partitioned tables at the same time, the partition table is handled through the queue mechanism.
That is, only one partition table can be processed at a time, and the other needs to wait in queue before processing the next one when the previous partition table is finished.



The following is a sample of the Concurrency collection statistics in the Oracle whitepaper:






We can see from this example that there are different levels of jobs when collecting statistics for concurrency,
For partitioned tables A job is assigned to each partition in addition to a coordination job.
Also, as mentioned earlier, multiple partitioned tables cannot be processed at the same time. The processing of the costs table and the sales table in the figure is sorted, i.e. the processing of the sales table will begin after the job of the costs table is finished.


Concurrent parameters:


The concurrent parameter is used to control concurrency collection statistics. Need to pass dbms_stats. Set_global_prefs to make global settings.



On the 11.2.0.2~11.2.0.4 version, you can set the value:


TRUE  :并发有效FALSE :并发无效。


On the 12c version, you can set the following values:


‘MANUAL‘   :只有当手动收集时,并发有效‘AUTOMATIC‘:只有当自动收集时,并发有效   ‘ALL‘      : 当手动/自动收集,并发都有效‘OFF‘      : 并发无效
Concurrent Execution Example 1 (11.2.0.3):


11.2.0.3 testing on the environment:



1. Preparation of test data:


Sql> Conn scott/tigerconnected.sql> CREATE TABLE Tbl1_nonpart1 (COL1 number, COL2 number, COL3 CHAR (Ten), COL4 Varcha R (Ten), COL5 DATE, PRIMARY KEY (col1,col2));  Table created. Sql> CREATE TABLE Tbl2_nonpart2 (COL1 number, COL2 number, COL3 CHAR (Ten), COL4 VARCHAR, COL5 DATE, PRIMARY KEY (COL   1,col2));  Table created. Sql> CREATE TABLE tbl3_nonpart3 (COL1 number, COL2 number, COL3 CHAR (Ten), COL4 VARCHAR, COL5 DATE, PRIMARY KEY (COL   1,col2));  Table created. Sql> CREATE TABLE Tbl4_part1 (COL1 number, COL2 number, COL3 CHAR (Ten), COL4 VARCHAR, COL5 DATE, PRIMARY KEY (col1,c   OL2) PARTITION by LIST (COL1) (PARTITION PTBL1 values (' 1 '), PARTITION PTBL2 values (' 2 '), PARTITION PTBL3 values (' 3 ')); 2 3 4 5 6 Table created. Sql> begin for I in 1..2000000 loop insert into Tbl1_nonpart1 values (1,lpad (i,10, ' 0 '), ' 1234500001 ', ' abcdefghij ', sys    Date);    INSERT into TBL2_NONPART2 values (2,lpad (i,10, ' 0 '), ' 1234500001 ', ' abcdefghij ', sysdate); INSERT INTOTbl3_nonpart3 VALUES (3,lpad (i,10, ' 0 '), ' 1234500001 ', ' abcdefghij ', sysdate);    INSERT into TBL4_PART1 values (1,lpad (i,10, ' 0 '), ' 1234500001 ', ' abcdefghij ', sysdate);    INSERT into TBL4_PART1 values (2,lpad (i,10, ' 0 '), ' 1234500001 ', ' abcdefghij ', sysdate); INSERT into TBL4_PART1 values (3,lpad (i,10, ' 0 '), ' 1234500001 ', ' abcdefghij ', sysdate); commit; End Loop; End /PL/SQL procedure successfully completed.


2. Set the relevant parameters:


---2.1.使Resource Manager有效(DEFAULT_PLAN)SQL> conn /as sysdbaConnected.SQL> ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = ‘DEFAULT_PLAN‘;System altered.---2.2.查看JOB_QUEUE_PROCESSES设置,如果需要可以改变SQL> SELECT VALUE FROM v$parameter WHERE NAME=‘job_queue_processes‘;VALUE--------------------------------------------------------------------------------1000---2.3.设置CONCURRENT为TRUE  ※SQL> SELECT DBMS_STATS.GET_PREFS(‘CONCURRENT‘) FROM DUAL;DBMS_STATS.GET_PREFS(‘CONCURRENT‘)--------------------------------------------------------------------------------OFFSQL> EXEC DBMS_STATS.SET_GLOBAL_PREFS(‘CONCURRENT‘,‘TRUE‘);PL/SQL procedure successfully completed.SQL> SELECT DBMS_STATS.GET_PREFS(‘CONCURRENT‘) FROM DUAL;DBMS_STATS.GET_PREFS(‘CONCURRENT‘)--------------------------------------------------------------------------------TRUE


3. Performing statistical information collection


SQL> conn /as sysdbaSQL> EXEC DBMS_STATS.GATHER_SCHEMA_STATS(‘SCOTT‘);PL/SQL procedure successfully completed.


4. Monitoring results


SQL> select job_name, state, commentsfrom dba_scheduler_jobswhere job_class like ‘CONC%‘;JOB_NAME       STATE                                         COMMENTS-------------- --------------------------------------------- ------------------------ST$SD1_7       RUNNING                                       "SCOTT"."TBL3_NONPART3".ST$SD1_6       RUNNING                                       "SCOTT"."TBL2_NONPART2".ST$SD1_5       RUNNING                                       "SCOTT"."TBL1_NONPART1".....
Concurrent Execution Example 2 (12.1.0.2):


12.1.0.2 testing on the environment:



1. Preparation of test data:


Sql> Conn scott/tigerconnected.sql> CREATE TABLE Tbl1_nonpart1 (COL1 number, COL2 number, COL3 CHAR (Ten), COL4 Varcha R (Ten), COL5 DATE, PRIMARY KEY (col1,col2));  Table created. Sql> CREATE TABLE Tbl2_nonpart2 (COL1 number, COL2 number, COL3 CHAR (Ten), COL4 VARCHAR, COL5 DATE, PRIMARY KEY (COL   1,col2));  Table created. Sql> CREATE TABLE tbl3_nonpart3 (COL1 number, COL2 number, COL3 CHAR (Ten), COL4 VARCHAR, COL5 DATE, PRIMARY KEY (COL   1,col2));  Table created. Sql> CREATE TABLE Tbl4_part1 (COL1 number, COL2 number, COL3 CHAR (Ten), COL4 VARCHAR, COL5 DATE, PRIMARY KEY (col1,c   OL2) PARTITION by LIST (COL1) (PARTITION PTBL1 values (' 1 '), PARTITION PTBL2 values (' 2 '), PARTITION PTBL3 values (' 3 ')); 2 3 4 5 6 Table created. Sql> begin for I in 1..4000000 loop insert into Tbl1_nonpart1 values (1,lpad (i,10, ' 0 '), ' 1234500001 ', ' abcdefghij ', sys    Date);    INSERT into TBL2_NONPART2 values (2,lpad (i,10, ' 0 '), ' 1234500001 ', ' abcdefghij ', sysdate); INSERT INTOTbl3_nonpart3 VALUES (3,lpad (i,10, ' 0 '), ' 1234500001 ', ' abcdefghij ', sysdate);    INSERT into TBL4_PART1 values (1,lpad (i,10, ' 0 '), ' 1234500001 ', ' abcdefghij ', sysdate);    INSERT into TBL4_PART1 values (2,lpad (i,10, ' 0 '), ' 1234500001 ', ' abcdefghij ', sysdate); INSERT into TBL4_PART1 values (3,lpad (i,10, ' 0 '), ' 1234500001 ', ' abcdefghij ', sysdate); commit; End Loop; End /PL/SQL procedure successfully completed.


2. Set the relevant parameters:


---2.1.使Resource Manager有效(DEFAULT_PLAN)SQL> conn /as sysdbaConnected.SQL> ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = ‘DEFAULT_PLAN‘;System altered.---2.2.查看JOB_QUEUE_PROCESSES设置,如果需要可以改变SQL> SELECT VALUE FROM v$parameter WHERE NAME=‘job_queue_processes‘;VALUE--------------------------------------------------------------------------------1000---2.3.设置CONCURRENT为ALLSQL> SELECT DBMS_STATS.GET_PREFS(‘CONCURRENT‘) FROM DUAL;DBMS_STATS.GET_PREFS(‘CONCURRENT‘)--------------------------------------------------------------------------------OFFSQL> EXEC DBMS_STATS.SET_GLOBAL_PREFS(‘CONCURRENT‘,‘ALL‘);PL/SQL procedure successfully completed.SQL> SELECT DBMS_STATS.GET_PREFS(‘CONCURRENT‘) FROM DUAL;DBMS_STATS.GET_PREFS(‘CONCURRENT‘)--------------------------------------------------------------------------------ALL


3. Performing statistical information collection


SQL> conn /as sysdbaSQL> EXEC DBMS_STATS.GATHER_SCHEMA_STATS(‘SCOTT‘);PL/SQL procedure successfully completed.


4. Monitoring results


sql> SET linesize 200sql> column TARGET format a25sql> column target_type FORMAT a25sql> column Job_name FORMA T a14sql> COLUMN start_time FORMAT a40sql> SELECT TARGET, Target_type, Job_name, To_char (start_time, ' dd-mon-y               YYY Hh24:mi:ss ') from dba_optstat_operation_taskswhere STATUS = ' in PROGRESS ' and opid = (SELECT MAX (ID)  From dba_optstat_operations WHERE operation = ' gather_schema_stats '); 2 3 4 5 6 7 TARGET target_type job_name to_char (start_time, ' D------ -------------------------------------------------------------------------------SCOTT. Tbl4_part1 TABLE (Coordinator JOB) St$sd42_2 18-jun-2016 14:07:56scott. Tbl4_part1 TABLE (GLOBAL STATS only) st$t44_2 18-jun-2016 14:07:57scott. sys_c0010346 INDEX st$t44_2 18-jun-2016 14:08:11scott. Tbl1_nonpart1 TABLE st$sd42_1_b7 18-jun-2016 14:07:58scott. sys_c0010343 INDEX st$sd42_1_b7 18-jun-2016 14:08:09
Monitoring of concurrency statistics collection


Concurrency statistics collection can be monitored through the following views


DBA_OPTSTAT_OPERATION_TASKS:当前和历史的统计信息收集的执行任务 (12c)DBA_OPTSTAT_OPERATIONS     :当前和历史的统计信息收集的执行操作 (12c)DBA_SCHEDULER_JOBS         : SCHEDULER JOBS信息
Parallel VS concurrency


Sometimes there may be some confusion about the concept of parallel and concurrency statistics collection, and here's a table to summarize the comparison of parallel and concurrency statistics collection:





Concurrency and parallel execution statistics collection combinations


For increased efficiency, concurrent and parallel execution of statistical information collection can be effective, and this combination is very efficient for very large tables and partitions.
For the combination to work, you need to set the Parallel_adaptive_multi_user parameter to False to prevent the parallelism from being invalidated by an adaptive result.



Cases:


ALTER SYSTEM SET PARALLEL_ADAPTIVE_MULTI_USER=false;


For Parallel_adaptive_multi_user parameters, refer to the online documentation:



Database Reference
Parallel_adaptive_multi_user


Consulting Case:


In the author's experience, have encountered the following several consulting cases, here on the survey methods and results to share some.


Does the SE version concurrency statistics collection work?


Although the techniques used by concurrent execution statistics collection include Resource Manager, and Resource Manager is a feature that is used by the Enterprise Edition version, standard Some of the actions inside the edition also use the Resource Manager function, so it is also possible to perform concurrent statistics collection without explicitly using Resource manager.
This conclusion can also be verified by running the above example in Standard Edition.


Set the relevant content, but the concurrency statistics collection is invalid?


There are user inquiries, why my environment has set up the concurrency of statistical Information collection, view the related view but found that the statistical information collected is not executed concurrently?
This problem, in fact, due to concurrent execution, Oracle internal actually has a certain threshold value set. When these tables are small and other environmental factors, Oracle merges multiple tables and partitions in a single job for batch execution, resulting in the absence of concurrent execution of statistical information collection.



We can see some of the relevant content by tracking dbms_stats.
Cases:


SQL> conn /as sysdbaSQL> exec dbms_stats.set_global_prefs(‘TRACE‘, 4+8+16+128+2048);PL/SQL procedure successfully completed.SQL> EXEC DBMS_STATS.GATHER_SCHEMA_STATS(‘SCOTT‘);PL/SQL procedure successfully completed.SQL> exec dbms_stats.set_global_prefs(‘TRACE‘, 0);PL/SQL procedure successfully completed.


To view the related trace files:


DBMS_STATS: Estimate cost for target:  ownname: "SCOTT" tabname: "TBL2_NONPART2" partname:  subpartname:  type: 1DBMS_STATS:


As we can see, some comparisons are made during dbms_stats execution, and if the pre-estimate does not meet the threshold, it will be processed in batches.



Note
This critical value (batching threshold) is influenced by many factors and varies depending on the environment.


How do I restrict concurrency statistics collection to only a subset of tables?


In the past consulting cases, some customers did want to collect concurrency statistics for only a subset of the tables in a schema.
This requirement can be achieved by dbms_stats the "obj_filter_list" parameter.



Cases:


DECLARE      filter_lst  DBMS_STATS.OBJECTTAB := DBMS_STATS.OBJECTTAB();      obj_lst     DBMS_STATS.OBJECTTAB := DBMS_STATS.OBJECTTAB();BEGIN  filter_lst.extend(5);  filter_lst(1).ownname := ‘SH‘;  filter_lst(1).objname := ‘SALES‘;  filter_lst(2).ownname := ‘SH‘;  filter_lst(2).objname := ‘COSTS‘;  filter_lst(3).ownname := ‘SH‘;  filter_lst(3).objname := ‘SALES2‘;  filter_lst(4).ownname := ‘SH‘;  filter_lst(4).objname := ‘COSTS2‘;  filter_lst(5).ownname := ‘SH‘;  filter_lst(5).objname := ‘SALES3‘; DBMS_STATS.GATHER_SCHEMA_STATS(ownname=>‘SH‘,objlist=>obj_lst, obj_filter_list=>filter_lst);END;/


For this, you can refer to a blog that Maria Colgan-oracle wrote about:
Oracle Optimizer Blog
>how do I restrict concurrent statistics gathering to a small set of tables from a single schema?


Reference:


Faq:gathering Concurrent Statistics Using Dbms_stats Frequently asked Questions (Doc ID 1555451.1)



Database SQL Tuning Guide
>gathering Optimizer Statistics concurrently



Oracle Optimizer Blog
>concurrent Statistics Gathering
>how do I restrict concurrent statistics gathering to a small set of tables from a single schema?



★white Papers
Understanding Optimizer Statistics with Oracle Database 12c-part 1


Improving the efficiency of gathering Statistics

Concurrent Statistic Gathering


Best practices to gathering Optimizer Statistics with Oracle Database 12c-part 2


Improving the efficiency of gathering statistics

Intra Object Parallelism
Inter Object Parallelism


Parallel and concurrent collection of statistical information


Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.