Oracle Dynamic Sampling Learning


Dynamic sampling, introduced in Oracle 9i Release 2, helps the optimizer generate a better execution plan when a database object has not been analyzed (i.e., its statistics are missing). Simply put, when a segment (table, index, or partition) has no statistics, dynamic sampling lets the CBO read a small number of the object's blocks at parse time and gather just enough information to choose a reasonable plan. Dynamic sampling only supplements statistics; it cannot fully replace regular statistics gathering and analysis.
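To see whether a table already has statistics (and therefore whether dynamic sampling is likely to kick in at parse time), you can check the data dictionary; the table name TEST below is just an illustration:

```sql
-- NUM_ROWS and LAST_ANALYZED are NULL for a table with no statistics,
-- which is exactly the case where dynamic sampling steps in.
SELECT table_name, num_rows, last_analyzed
  FROM user_tables
 WHERE table_name = 'TEST';
```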


Note: this feature is called dynamic sampling before Oracle 12c; in Oracle 12c it was renamed to dynamic statistics.

The default sampling level in Oracle 11g R2:

SQL> show parameter optimizer_dynamic_sampling

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_dynamic_sampling           integer     2


Dynamic sampling has 11 levels (0 through 10); see the official documentation for the details of each level:

Http://docs.oracle.com/cd/E11882_01/server.112/e41573/stats.htm#PFGRF30101
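The sampling level can be raised for a whole session, or per statement with a hint, as the experiments below do; the level value 4 and the table alias t here are only illustrations:

```sql
-- Raise the dynamic sampling level for the current session only.
ALTER SESSION SET optimizer_dynamic_sampling = 4;

-- Or override it for a single statement with a hint.
SELECT /*+ dynamic_sampling(t 4) */ * FROM t;
```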



Dynamic Sampling Experiment:


1. Create a test table

SQL> create table test as select * from dba_objects;


Table created.


SQL> select count(1) from test;

  COUNT(1)
----------
     86259


2. View the execution plan without dynamic sampling

SQL> set autotrace traceonly explain
SQL> select /*+ dynamic_sampling(test 0) */ * from test;


Execution Plan

----------------------------------------------------------

Plan hash value: 1357081020

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   100K|    19M|   336   (1)| 00:00:05 |
|   1 |  TABLE ACCESS FULL| TEST |   100K|    19M|   336   (1)| 00:00:05 |
--------------------------------------------------------------------------




As seen above, with sampling disabled the optimizer's estimated row count for table test is shown as 100K, a pure guess based on segment size. Now look at the execution plan with dynamic sampling to see how many rows the optimizer estimates:


3. View the execution plan with dynamic sampling (a plain query suffices, because dynamic sampling is enabled by default in 11g)

SQL> select * from test;


Execution Plan

----------------------------------------------------------

Plan hash value: 1357081020

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 72258 |    14M|   336   (1)| 00:00:05 |
|   1 |  TABLE ACCESS FULL| TEST | 72258 |    14M|   336   (1)| 00:00:05 |
--------------------------------------------------------------------------


Note

-----

-Dynamic sampling used for this statement (level=2)

With dynamic sampling enabled (the default level is 2), the optimizer samples a few blocks and estimates that table test has 72258 rows, which is reasonably close to the actual count of 86259 and far better than the estimate without dynamic sampling.



If we raise the dynamic sampling level to 3, as shown below, the information the optimizer gathers becomes more accurate than at the default level 2: it now estimates that table test has 92364 rows, closer to the actual 86259 than the level-2 estimate of 72258.


SQL> select /*+ dynamic_sampling(test 3) */ * from test;


Execution Plan

----------------------------------------------------------

Plan hash value: 1357081020

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 92364 |    18M|   336   (1)| 00:00:05 |
|   1 |  TABLE ACCESS FULL| TEST | 92364 |    18M|   336   (1)| 00:00:05 |
--------------------------------------------------------------------------


Note

-----

-Dynamic sampling used for this statement (level=2)



4. As Tom Kyte has noted, without dynamic sampling, if a table's rows are deleted, the CBO still estimates the same result-set size as before the delete.

This is because DELETE does not release the extents and blocks allocated to the table (the high-water mark does not move). A CBO that cannot sample blocks, and relies only on segment information from the data dictionary, therefore believes the data is still there. Let's delete the test table's rows and look at the execution plans:
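A quick way to confirm that DELETE leaves the segment size unchanged while TRUNCATE resets the high-water mark is to compare the allocated blocks before and after each operation. This is a side sketch, separate from the experiment below (which keeps the deleted-but-not-truncated state):

```sql
-- Blocks allocated to the segment; this count tracks the high-water mark.
SELECT segment_name, blocks FROM user_segments WHERE segment_name = 'TEST';

DELETE FROM test;      -- removes rows but keeps every allocated block
COMMIT;
-- Re-running the query above shows BLOCKS unchanged.

TRUNCATE TABLE test;   -- deallocates space and resets the high-water mark
-- Now the same query shows the segment shrunk back to its initial extent.
```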


SQL> delete from test;

86259 rows deleted.

SQL> commit;

SQL> select /*+ dynamic_sampling(test 0) */ * from test;   -- without dynamic sampling

Execution Plan

----------------------------------------------------------

Plan hash value: 1357081020

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   100K|    19M|   336   (1)| 00:00:05 |
|   1 |  TABLE ACCESS FULL| TEST |   100K|    19M|   336   (1)| 00:00:05 |
--------------------------------------------------------------------------


SQL> select * from test;   -- with dynamic sampling


Execution Plan

----------------------------------------------------------

Plan hash value: 1357081020

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   207 |   335   (0)| 00:00:05 |
|   1 |  TABLE ACCESS FULL| TEST |     1 |   207 |   335   (0)| 00:00:05 |
--------------------------------------------------------------------------


Note

-----

-Dynamic sampling used for this statement (level=2)


From the two plans above, the difference between using and not using dynamic sampling is obvious.


5. Now gather statistics on the test table and query again: the dynamic sampling note disappears from the execution plan.
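The statistics-gathering call itself is not shown in the original transcript; a typical invocation (the SYS schema matching the later step is an assumption) would be:

```sql
-- Gather optimizer statistics on TEST so the CBO no longer needs to sample.
EXEC dbms_stats.gather_table_stats(ownname => 'SYS', tabname => 'TEST', cascade => TRUE);
```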

SQL> select * from test;


Execution Plan

----------------------------------------------------------

Plan hash value: 1357081020

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   207 |   335   (0)| 00:00:05 |
|   1 |  TABLE ACCESS FULL| TEST |     1 |   207 |   335   (0)| 00:00:05 |
--------------------------------------------------------------------------


SQL>



The second case: even when table test has been analyzed, dynamic sampling is still used if the query references a temporary table, because the temporary table has no statistics. As shown below:


SQL> drop table test;

SQL> create table test as select * from dba_objects;

SQL> exec dbms_stats.gather_table_stats(ownname => 'SYS', tabname => 'TEST', cascade => true);

SQL> create global temporary table tmp (object_type varchar2(19));

SQL> insert into tmp select distinct object_type from dba_objects;

rows created.

SQL> commit;


Then look at the execution plan for the following query statement:


SQL> select t.owner, l.object_type from test t inner join tmp l on t.object_type = l.object_type;

Execution Plan

----------------------------------------------------------

Plan hash value: 19574435

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    26 |   338   (1)| 00:00:05 |
|*  1 |  HASH JOIN         |      |     1 |    26 |   338   (1)| 00:00:05 |
|   2 |   TABLE ACCESS FULL| TMP  |     1 |    11 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| TEST | 86260 |  1263K|   336   (1)| 00:00:05 |
---------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T"."OBJECT_TYPE"="L"."OBJECT_TYPE")


Note

-----

-Dynamic sampling used for this statement (level=2)


Sql>

As you can see above, a full table scan is performed on the TMP table, and dynamic sampling gives the optimizer an estimate for it (1 row) even though TMP has no statistics at all.
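If a low estimate on the temporary table ever led to a poor join plan, the sampling level could be raised for that query alone; the level 4 here is only an illustration:

```sql
-- Sample the unanalyzed temporary table more aggressively for this
-- query only, via the alias l used in the join.
SELECT /*+ dynamic_sampling(l 4) */ t.owner, l.object_type
  FROM test t
 INNER JOIN tmp l ON t.object_type = l.object_type;
```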



6. Dynamic sampling also has a unique ability: it can capture correlation between different columns.

Ordinary table statistics treat columns as independent, so when a query predicate involves correlated columns the statistics fall short. Consider this example from Tom Kyte:


6.1. Create a special table T, create an index T_IDX on columns FLAG1 and FLAG2, then gather statistics:

SQL> create table t as select decode(mod(rownum,2), 0, 'N', 'Y') flag1, decode(mod(rownum,2), 0, 'Y', 'N') flag2, a.* from all_objects a;

SQL> create index t_idx on t (flag1, flag2);

SQL> begin
       dbms_stats.gather_table_stats(user, 'T',
         method_opt => 'for all indexed columns size 254');
     end;
     /

PL/SQL procedure successfully completed.


6.2. View the number of rows in the table:

SQL> select num_rows, num_rows/2, num_rows/2/2 from user_tables where table_name = 'T';

  NUM_ROWS NUM_ROWS/2 NUM_ROWS/2/2
---------- ---------- ------------
     84396      42198        21099


6.3. Look at the execution plan for a query filtering on FLAG1:

SQL> select * from t where flag1 = 'N';


Execution Plan

----------------------------------------------------------

Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 42937 |  4276K|   342   (1)| 00:00:05 |
|*  1 |  TABLE ACCESS FULL| T    | 42937 |  4276K|   342   (1)| 00:00:05 |
--------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("FLAG1"='N')


From the plan above, the CBO estimates 42937 rows, quite close to the actual 42198.


6.4. Look at the execution plan for a query filtering on FLAG2:

SQL> select * from t where flag2 = 'N';


Execution Plan

----------------------------------------------------------

Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 41459 |  4129K|   342   (1)| 00:00:05 |
|*  1 |  TABLE ACCESS FULL| T    | 41459 |  4129K|   342   (1)| 00:00:05 |
--------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("FLAG2"='N')


From the plan above, the CBO estimates 41459 rows, again quite close to the actual 42198.



6.5. For the condition flag1 = 'N' and flag2 = 'N', simple logic tells us no such row can exist, since every row has opposite values in the two columns; that is precisely why this special table was constructed. Let's see what the CBO predicts:


SQL> select * from t where flag1 = 'N' and flag2 = 'N';


Execution Plan

----------------------------------------------------------

Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 21093 |  2101K|   342   (1)| 00:00:05 |
|*  1 |  TABLE ACCESS FULL| T    | 21093 |  2101K|   342   (1)| 00:00:05 |
--------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("FLAG2"='N' AND "FLAG1"='N')


From the plan above, the CBO estimates 21093 rows, while the true count is 0; the estimate is wildly off. This comes straight from the CBO multiplying the two independent column selectivities:

rows with flag1 = 'N' account for 1/2 of the total
rows with flag2 = 'N' account for 1/2 of the total

6.6. Hence the estimate of roughly num_rows/2/2 = 21099, which is clearly unreasonable for correlated columns. Let's see whether raising the dynamic sampling level lets the CBO avoid this mistake:

SQL> select /*+ dynamic_sampling(t 3) */ * from t where flag1 = 'N' and flag2 = 'N';


Execution Plan

----------------------------------------------------------

Plan hash value: 470836197

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     6 |   612 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |     6 |   612 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T_IDX |     6 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("FLAG1"='N' AND "FLAG2"='N')


Note

-----

-Dynamic sampling used for this statement (level=2)
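With the higher sampling level the estimate drops to 6 rows and the plan switches to the index. As a supplement (not part of the original experiment), 11g also offers extended statistics: a column group on the two flags makes the stored statistics themselves correlation-aware, with no extra parse-time sampling:

```sql
-- Create a column group on (FLAG1, FLAG2), then regather statistics so
-- the CBO sees the combined selectivity of the two correlated columns.
SELECT dbms_stats.create_extended_stats(user, 'T', '(FLAG1, FLAG2)') FROM dual;
EXEC dbms_stats.gather_table_stats(user, 'T', method_opt => 'for all columns size auto')
```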



Attention:

①: The higher the sampling level, the more blocks are sampled and the closer the estimates get to reality, but the resource cost of parsing grows accordingly; this is a trade-off. The default level in Oracle 11g is 2, and the level is typically adjusted within a session or per statement using the dynamic_sampling hint.

②: Dynamic sampling is no silver bullet: the more blocks it samples, the higher the overhead, which lengthens SQL hard-parse time. In a data warehouse (DW/OLAP) environment, where SQL runs for a long time and hard parsing is only a small fraction of total execution time, raising the level helps the optimizer obtain more accurate information; a setting of 3 or 4 is usually appropriate.

③: In a highly concurrent OLTP system executing thousands of short SQL statements per second, parse time matters, so you should lower the dynamic sampling level or not use dynamic sampling at all.


This article is from the "stupid Child's DBA path" blog, please be sure to keep this source http://fengfeng688.blog.51cto.com/4896812/1920184
