About dynamic sampling (sampling)

Source: Internet
Author: User

about dynamic sampling (sampling)

Original: http://www.oracle.com/technetwork/issue-archive/2009/09-jan/o19asktom-086775.html


This article will answer: What is dynamic sampling? What is the role of dynamic sampling? And the meaning of different levels of dynamic sampling?


1. What is dynamic sampling?
Dynamic sampling was introduced from Oracle 9i version 2nd. It enables the optimizer (CBO) to sample an unresolved table during hard parsing
(any table, which has been created and loaded but not yet analyzed), which determines the table's default statistics, and can validate the optimizer's conjecture.
Named dynamic sampling because it dynamically generates better statistics for the optimizer during query hard parsing.


Dynamic sampling provides 11 levels of settings. Note: The default value in 9i is 1 to 10g and the default value is 2


2. How does dynamic sampling work?
There are two ways to use it:
Set the optimizer_dynamic_sampling parameter so that dynamic sampling can be set at the instance and session level.
Using dynamic_sampling hint


Take a look at the days when you don't use dynamic sampling.
CREATE TABLE T
As
Select owner, object_type
From All_objects
/




Select COUNT (*) from T;


COUNT (*)
------------------------
68076


Code1: Disable dynamic sampling to observe the default cardinality


Set Autotrace traceonly explain
Sql> Select/*+ dynamic_sampling (t 0) */* from t;


Execution Plan
------------------------------
Plan Hash value:1601196873


--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time |
--------------------------------------------------------------------------
| 0 |      SELECT STATEMENT | |   16010 |    437k| 55 (0) | 00:00:01 |
|  1 | TABLE ACCESS full| T |   16010 |    437k| 55 (0) | 00:00:01 |
--------------------------------------------------------------------------


--Note that class 0 disables dynamic sampling, and the environment by default is to turn on dynamic sampling


Execution plan display cardinality: 16010 is far below 68076 of the above query, obviously not reliable.


Code2: More approximate cardinality of the display
select * from T;


Execution Plan
------------------------------
Plan Hash value:1601196873


--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time |
--------------------------------------------------------------------------
| 0 |      SELECT STATEMENT | |  77871 |    2129k| 56 (2) | 00:00:01 |
|  1 | TABLE ACCESS full| T |  77871 |    2129k| 56 (2) | 00:00:01 |
--------------------------------------------------------------------------


Note
------------------------------------------
-Dynamic sampling used for this statement




Code3: The overvalued cardinality
Sql> Delete from T;
68076 rows deleted.


Sql> commit;
Commit complete.


Sql> Set Autotrace traceonly explain
Sql> Select/*+ dynamic_sampling (t 0) */* from t;


Execution Plan
------------------------------
Plan Hash value:1601196873


--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time |
--------------------------------------------------------------------------
| 0 |      SELECT STATEMENT | |   16010 |    437k| 55 (0) | 00:00:01 |
|  1 | TABLE ACCESS full| T |   16010 |    437k| 55 (0) | 00:00:01 |
--------------------------------------------------------------------------


Sql> select * from T;


Execution Plan
-----------------------------
Plan Hash value:1601196873


--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time |
--------------------------------------------------------------------------
| 0 |      SELECT STATEMENT |     |    1 |    28 | 55 (0) | 00:00:01 |
|  1 | TABLE ACCESS full|     T |    1 |    28 | 5 (0) | 00:00:01 |
--------------------------------------------------------------------------


Note
---------------------------------------
-Dynamic sampling used for this statement




3. When does dynamic sampling help the optimizer to verify its guesses?
We know that when the table information is collected using Dbms_stats, the optimizer gets the following statistics:
1) Table, number of rows, average line width, etc.;
2) Individual column, high and low value, unique value number, histogram (possible), etc.
3) Individual index, aggregation factor, number of leaf blocks, index height, etc.


Note, however, that some key statistics are missing in this area, such as the association between different column data in the table!
Suppose you have a global census table!
One property is: birth month month_born_in, another attribute is: the constellation Zodiac_sign. After collecting the information, you ask the optimizer how many people were born in November?
Assuming that the 12-month number is normally distributed, the optimizer quickly gives an answer of 1/12 of the total data! One more question: what is the constellation of Pisces? The answer is 1/12!
Optimizer fluent so far!!! Nice work!
But the 3rd question is: What is the number of people born in November and the constellation Pisces?
Discerning eye The answer is 0 (Pisces February 19 – March 20)! But we look at the optimizer's answer: 1/12/12!!! What a whimsical answer, mind-fixed! This will lead to the birth of a poor execution plan,
It is also at this point that our dynamic sampling begins to intervene:


CODE4: Creating Simulation data
SQL > CREATE TABLE t
As Select Decode (mod (rownum,2), 0, ' N ', ' Y ') Flag1,
Decode (mod (rownum,2), 0, ' Y ', ' N ') Flag2, a.*
From All_objects A
/
Table created.


SQL > CREATE index t_idx on T (FLAG1,FLAG2);
Index created.


SQL > Begin
Dbms_stats.gather_table_stats
(User, ' T ',
Method_opt=> ' For all indexed columns size 254 ');
End
/
PL/SQL procedure successfully completed.


Sql> Select Num_rows, NUM_ROWS/2,
NUM_ROWS/2/2 from User_tables
WHERE table_name = ' T ';


Num_rows NUM_ROWS/2 NUM_ROWS/2/2
--------  ----------  ------------
68076 34038 17019


CODE5: Verify the above statement:
Sql> Set Autotrace traceonly explain
Sql> select * from t where flag1= ' N ';


Execution Plan
------------------------------
Plan Hash value:1601196873


--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time |
--------------------------------------------------------------------------
| 0 |      SELECT STATEMENT | |  33479 |   3432k| 292 (1) | 00:00:04 |
|* 1 | TABLE ACCESS full| T |  33479 |   3432k| 292 (1) | 00:00:04 |
--------------------------------------------------------------------------


predicate information (identified by Operation ID):
---------------------------------------------------
1-filter ("FLAG1" = ' N ')


Sql> select * from t where flag2= ' N ';


Execution Plan
----------------------------
Plan Hash value:1601196873


---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time |
---------------------------------------------------------------------------
| 0 |      SELECT STATEMENT | |   34597 |   3547k| 292 (1) | 00:00:04 |
|* 1 | TABLE ACCESS full| T |   34597 |   3547k| 292 (1) | 00:00:04 |
---------------------------------------------------------------------------


predicate information (identified by Operation ID):
---------------------------------------------------


1-filter ("FLAG2" = ' N ')


--Everything's fine! So far, so good!


Code5:here comes the problem
Sql> select * from t where Flag1 = ' n ' and flag2 = ' n ';


Execution Plan
----------------------------
Plan Hash value:1601196873


--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time |
--------------------------------------------------------------------------
| 0 |      SELECT STATEMENT | |  17014 |   1744k| 292 (1) | 00:00:04 |
|* 1 | TABLE ACCESS full| T |  17014 |   1744k| 292 (1) | 00:00:04 |
--------------------------------------------------------------------------


predicate information (identified by Operation ID):
----------------------------------------------------


1-filter ("FLAG1" = ' n ' and "FLAG2" = ' n ')


--Verifying that the optimizer we said earlier is whimsical at this point.


Code7: Dynamic sampling hearing, getting involved
Sql> Select/*+ dynamic_sampling (t 3) */* from t where Flag1 = ' n ' and flag2 = ' n ';


Execution Plan
-----------------------------
Plan Hash value:470836197


------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time |
------------------------------------------------------------------------------------
| 0 |       SELECT STATEMENT |    |   6 |     630 | 2 (0) | 00:00:01 |
|  1 | TABLE ACCESS by INDEX rowid|    T |   6 |     630 | 2 (0) | 00:00:01 |
|* 2 | INDEX RANGE SCAN |    T_idx |       6 |     | 1 (0) | 00:00:01 |
------------------------------------------------------------------------------------


predicate information (identified by Operation ID):
----------------------------------------------------


2-access ("FLAG1" = ' n ' and "FLAG2" = ' n ')


Code8: We open sql_trace and we see the following statement:
SELECT/* Opt_dyn_samp *//*+ all_rows Ignore_where_clause
No_parallel (samplesub) opt_param (' parallel_execution_enabled ', ' false ')
No_parallel_index (samplesub) No_sql_tune */NVL (SUM (C1),: "Sys_b_00"),
NVL (sum (C2),: "sys_b_01"), NVL (sum (C3),: "Sys_b_02")
From
(Select/*+ ignore_where_clause no_parallel ("T") Full ("T")
No_parallel_index ("T") */: "sys_b_03" as C1, Case "T". " FLAG1 "=
: "Sys_b_04" and "T". " FLAG2 "=:" Sys_b_05 "then:" Sys_b_06 "ELSE:" sys_b_07 "
END as C2, case "T". " FLAG2 "=:" sys_b_08 "and" T "." FLAG1 "=:" sys_b_09 "
Then: "Sys_b_10" ELSE: "Sys_b_11" END as C3 from "T" SAMPLE BLOCK
(: "Sys_b_12",: "Sys_b_13") SEED (: "sys_b_14") "T") samplesub

It can be seen that the optimizer is validating its conjecture ...


4. Dynamic Sampling level:
11 levels are listed now, please refer to the following for details: http://docs.oracle.com/cd/B19306_01/server.102/b14211/stats.htm#i43032
1) level 0:do isn't use dynamic sampling.
Level 0: Do not use dynamic sampling.


2) Level 1:sample All tables that has not been analyzed if the following criteria was met: (1) There is at least 1 unanal yzed table in the query; (2) This unanalyzed table was joined to another table or appears in a subquery or non-mergeable view; (3) This unanalyzed table has no indexes; (4) This unanalyzed table have more blocks than the number of blocks that would is used for dynamic sampling of this table. The number of blocks sampled is the default number of the dynamic sampling blocks (32).
Level 1: All tables that have not been analyzed are sampled if the following conditions are met:
(1) There is at least one non-analytic table in the query;
(2) This non-analytic table is associated with another table or appears in a subquery or a non-merge view;
(3) This non-analytical table has an index;
(4) This non-analytic table has excess dynamic sampling default number of blocks (default is 32 blocks).


3) level 2:apply dynamic sampling to all unanalyzed tables. The number of blocks sampled is both times the default number of the dynamic sampling blocks.
Level 2: Dynamic sampling of all the non-parsed tables. The number of sampled data blocks is twice times the default number.


4) level 3:apply dynamic sampling to all tables this meet level 2 criteria, plus all tables for which standard selectivity Estimation used a guess for some predicate so is a potential dynamic sampling predicate. The number of blocks sampled is the default number of the dynamic sampling blocks. For unanalyzed tables, the number of blocks sampled is the times the default number of the dynamic sampling blocks.
Level 3: The number of sampled data blocks equals the default number, plus those using the conjecture selection elimination table on a level 2 basis. For non-parsed tables, the number of samples is twice times the default number.




5) level 4:apply dynamic sampling to all tables this meet level 3 criteria, plus all tables that has single-table Predica TES that reference 2 or more columns. The number of blocks sampled is the default number of the dynamic sampling blocks. For unanalyzed tables, the number of blocks sampled is the times the default number of the dynamic sampling blocks.
Level 4: The number of sampled blocks equals the default number, plus those with the single table predicate associated with 2 or more columns on a level 3 basis. For non-parsed tables, the number of samples is twice times the default number.




6) Levels 5, 6, 7, 8, and 9:apply dynamic sampling to all tables this meet the previous level criteria using 2, 4, 8, 32, or times the default number of the dynamic sampling blocks respectively.
The 5,6,7,8,9 level uses 2,4,8,32,128 times the default number of dynamically sampled data blocks on a level 4 basis, respectively.


7) level 10:apply dynamic sampling to all tables, meet the Level 9 criteria using all blocks in the table.
Level 10: All data blocks in the table are sampled on a level 9 basis.


5. When is dynamic sampling suitable?


This is a tricky question, not a certain use of experience, but really embarrassed to say.
Usually:
1) We use the 3 and 4 levels for dynamic sampling.
2) If the parsing time of our SQL is very fast but the execution time is very slow, we can consider using dynamic sampling. A typical data warehousing system.
3) The OLTP system is a SQL repeat, parsing and execution are wink, so high-level dynamic sampling is not recommended. This will result in hard parsing consumption of SQL.
This time you can consider SQL profile, which you can interpret as "static sampling."


About SQL Profiles reference: http://docs.oracle.com/cd/B28359_01/server.111/b28274/sql_tune.htm#PFGRF02605


-------------------------------------

Dylan presents.












About dynamic sampling (sampling)

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.