Oracle Histograms

Source: Internet
Author: User

When the data in a column is unevenly distributed, the CBO (cost-based optimizer) may need a histogram on that column in order to generate the best execution plan. The maximum number of buckets in a histogram is 254 (Oracle 12c raises this limit to 2048). Collecting histograms is very time-consuming, so do not collect them unless necessary.

Oracle has two kinds of histograms:

Frequency histogram. When the number of distinct values in the column is small (no more than 254) and you do not manually specify the number of buckets, Oracle automatically creates a frequency histogram, and the number of buckets equals the number of distinct values.

Height-balanced histogram. When the number of distinct values in the column is greater than 254 and you do not manually specify the number of buckets, Oracle automatically creates a height-balanced histogram.

Under what circumstances is a histogram used? When the column's values are very unevenly distributed and the column is often used in WHERE conditions.

Are histograms accurate? Not necessarily. If the number of distinct values in a column is very large, close to that of a primary key, no histogram is required, and in any case a histogram is not guaranteed to be 100% accurate.

The @ scripts used below are provided at the end of the article.

    SQL> drop table a;
    Table dropped.

    SQL> create table a as select * from dba_objects where rownum <= 10000;
    Table created.

    SQL> @anatab   -- gather regular statistics for the whole table
    Enter value for ownname: ggs
    Enter value for tabname: a
    Enter value for estimate_percent: 100
    Enter value for skewonly_repeat_auto:
    Enter value for degree: 4
    PL/SQL procedure successfully completed.
    Elapsed: 00:00:00.26

    SQL> @getcolstat   -- show the statistics and histogram of each column
    Enter value for owner: ggs
    Enter value for table_name: a

    COLUMN_NAME      NUM_ROWS CARDINALITY SELECTIVITY HISTOGRAM       NUM_BUCKETS LAST_ANALYZED
    ---------------- -------- ----------- ----------- --------------- ----------- -------------
    SECONDARY           10000           1         .01 NONE                      1 28-JUL-14
    GENERATED           10000           2         .02 NONE                      1 28-JUL-14
    TEMPORARY           10000           2         .02 NONE                      1 28-JUL-14
    STATUS              10000           1         .01 NONE                      1 28-JUL-14
    TIMESTAMP           10000         350         3.5 NONE                      1 28-JUL-14
    LAST_DDL_TIME       10000         385        3.85 NONE                      1 28-JUL-14
    CREATED             10000         303        3.03 NONE                      1 28-JUL-14
    OBJECT_TYPE         10000          34         .34 NONE                      1 28-JUL-14
    DATA_OBJECT_ID      10000        1836       18.36 NONE                      1 28-JUL-14
    OBJECT_ID           10000       10000         100 NONE                      1 28-JUL-14
    SUBOBJECT_NAME      10000          27         .27 NONE                      1 28-JUL-14
    OBJECT_NAME         10000        7725       77.25 NONE                      1 28-JUL-14
    OWNER               10000           9         .09 NONE                      1 28-JUL-14

    13 rows selected.

    SQL> select object_type, count(*) from a group by object_type;

    OBJECT_TYPE           COUNT(*)
    ------------------- ----------
    INDEX                      946
    JOB CLASS                    2
    CONTEXT                      2
    TYPE BODY                   82
    PROCEDURE                   50
    RESOURCE PLAN                3
    RULE                         1
    SCHEDULE                     1
    TABLE PARTITION             52
    WINDOW                       2
    WINDOW GROUP                 1
    TABLE                      841
    TYPE                      1088
    VIEW                      2953
    LIBRARY                    113
    FUNCTION                    68
    TRIGGER                      5
    PROGRAM                      3
    CLUSTER                     10
    SYNONYM                   2458
    PACKAGE BODY               470
    QUEUE                       21
    CONSUMER GROUP               5
    EVALUATION CONTEXT           8
    RULE SET                    11
    DIRECTORY                    2
    UNDEFINED                    6
    OPERATOR                    15
    SEQUENCE                   102
    LOB                        128
    PACKAGE                    485
    JOB                          6
    INDEX PARTITION             59
    LOB PARTITION                1

    34 rows selected.

    SQL> explain plan for select count(*) from a where object_type = 'INDEX';
    Explained.
    SQL> @getplan
    'general,outline,starts'
    Enter value for plan type: general

    PLAN_TABLE_OUTPUT
    ---------------------------------------------------------------------------
    Plan hash value: 2223038180

    ---------------------------------------------------------------------------
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |      |     1 |     7 |    25   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE    |      |     1 |     7 |            |          |
    |*  2 |   TABLE ACCESS FULL| A    |   294 |  2058 |    25   (0)| 00:00:01 |
    ---------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------
       2 - filter("OBJECT_TYPE"='INDEX')

The estimate of 294 rows differs from the actual count of 946. Without a histogram, the Rows estimate is simply the total number of rows divided by the number of distinct values in the column:

    SQL> select 10000/34 from dual;

      10000/34
    ----------
    294.117647

    1 row selected.

Now gather a histogram on the object_type column:

    SQL> @anatab_col
    Enter value for owner: ggs
    Enter value for table_name: a
    Enter value for columns: object_type
    PL/SQL procedure successfully completed.

    SQL> explain plan for select count(*) from a where object_type = 'INDEX';
    Explained.

    SQL> @getplan
    'general,outline,starts'
    Enter value for plan type: general

    PLAN_TABLE_OUTPUT
    ---------------------------------------------------------------------------
    Plan hash value: 2223038180

    ---------------------------------------------------------------------------
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |      |     1 |     7 |    25   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE    |      |     1 |     7 |            |          |
    |*  2 |   TABLE ACCESS FULL| A    |   946 |  6622 |    25   (0)| 00:00:01 |
    ---------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------
       2 - filter("OBJECT_TYPE"='INDEX')

The estimate now matches the actual number of rows. With a histogram on the object_type column, the cardinality estimate in the execution plan is accurate.

    SQL> @getcolstat
    Enter value for owner: ggs
    Enter value for table_name: a

    COLUMN_NAME      NUM_ROWS CARDINALITY SELECTIVITY HISTOGRAM       NUM_BUCKETS LAST_ANALYZED
    ---------------- -------- ----------- ----------- --------------- ----------- -------------
    SECONDARY           10000           1         .01 NONE                      1 28-JUL-14
    GENERATED           10000           2         .02 NONE                      1 28-JUL-14
    TEMPORARY           10000           2         .02 NONE                      1 28-JUL-14
    STATUS              10000           1         .01 NONE                      1 28-JUL-14
    TIMESTAMP           10000         350         3.5 NONE                      1 28-JUL-14
    LAST_DDL_TIME       10000         385        3.85 NONE                      1 28-JUL-14
    CREATED             10000         303        3.03 NONE                      1 28-JUL-14
    OBJECT_TYPE         10000          34         .34 FREQUENCY                34 28-JUL-14
    DATA_OBJECT_ID      10000        1836       18.36 NONE                      1 28-JUL-14
    OBJECT_ID           10000       10000         100 NONE                      1 28-JUL-14
    SUBOBJECT_NAME      10000          27         .27 NONE                      1 28-JUL-14
    OBJECT_NAME         10000        7725       77.25 NONE                      1 28-JUL-14
    OWNER               10000           9         .09 NONE                      1 28-JUL-14

    13 rows selected.

For OBJECT_TYPE, the number of buckets (34) is exactly equal to the number of distinct values.

Next, look at object_name. The table has only 10000 rows in total, so the selectivity of object_name is relatively high:

    SQL> select count(distinct object_name) from a;

    COUNT(DISTINCTOBJECT_NAME)
    --------------------------
                          7725

    1 row selected.

    SQL> @anatab_col
    Enter value for owner: ggs
    Enter value for table_name: a
    Enter value for columns: object_name
    PL/SQL procedure successfully completed.

    SQL> @getcolstat
    Enter value for owner: ggs
    Enter value for table_name: a

    COLUMN_NAME      NUM_ROWS CARDINALITY SELECTIVITY HISTOGRAM       NUM_BUCKETS LAST_ANALYZED
    ---------------- -------- ----------- ----------- --------------- ----------- -------------
    SECONDARY           10000           1         .01 NONE                      1 28-JUL-14
    GENERATED           10000           2         .02 NONE                      1 28-JUL-14
    TEMPORARY           10000           2         .02 NONE                      1 28-JUL-14
    STATUS              10000           1         .01 NONE                      1 28-JUL-14
    TIMESTAMP           10000         350         3.5 NONE                      1 28-JUL-14
    LAST_DDL_TIME       10000         385        3.85 NONE                      1 28-JUL-14
    CREATED             10000         303        3.03 NONE                      1 28-JUL-14
    OBJECT_TYPE         10000          34         .34 FREQUENCY                34 28-JUL-14
    DATA_OBJECT_ID      10000        1836       18.36 NONE                      1 28-JUL-14
    OBJECT_ID           10000       10000         100 NONE                      1 28-JUL-14
    SUBOBJECT_NAME      10000          27         .27 NONE                      1 28-JUL-14
    OBJECT_NAME         10000        7725       77.25 HEIGHT BALANCED          75 28-JUL-14
    OWNER               10000           9         .09 NONE                      1 28-JUL-14

    13 rows selected.

    SQL> select count(*) from a where object_name like '%A%';

      COUNT(*)
    ----------
          6404

    1 row selected.

    SQL> explain plan for select count(*) from a where object_name like '%A%';
    Explained.

    SQL> @getplan
    'general,outline,starts'
    Enter value for plan type: general

    PLAN_TABLE_OUTPUT
    ---------------------------------------------------------------------------
    Plan hash value: 2223038180

    ---------------------------------------------------------------------------
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |      |     1 |    19 |    25   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE    |      |     1 |    19 |            |          |
    |*  2 |   TABLE ACCESS FULL| A    |   500 |  9500 |    25   (0)| 00:00:01 |
    ---------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------
       2 - filter("OBJECT_NAME" LIKE '%A%')

    13 rows selected.

LIKE '%A%' is too complex for the CBO: without running the query, it cannot know how many rows are actually returned (6404 here, against an estimate of 500).

    SQL> col OBJECT_NAME for a30
    SQL> select OBJECT_NAME, count(*) from a group by OBJECT_NAME having count(*) > 3 order by count(*) desc;

    OBJECT_NAME                      COUNT(*)
    ------------------------------ ----------
    DBMS_REPCAT_AUTH                        5

    1 row selected.

    SQL> explain plan for select count(*) from a where OBJECT_NAME = 'DBMS_REPCAT_AUTH';
    Explained.
    SQL> @getplan
    'general,outline,starts'
    Enter value for plan type: general

    PLAN_TABLE_OUTPUT
    ---------------------------------------------------------------------------
    Plan hash value: 2223038180

    ---------------------------------------------------------------------------
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |      |     1 |    19 |    25   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE    |      |     1 |    19 |            |          |
    |*  2 |   TABLE ACCESS FULL| A    |     1 |    19 |    25   (0)| 00:00:01 |
    ---------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------
       2 - filter("OBJECT_NAME"='DBMS_REPCAT_AUTH')

This predicate is not complex, yet Rows = 1 is inaccurate (the actual count is 5). A histogram cannot guarantee 100% accuracy.

Therefore, not every column is suitable for a histogram. A column with many distinct values is not suitable at all, because the default number of buckets cannot hold them. A histogram is worthwhile only when the column's values are heavily skewed, there are few distinct values, and the column appears in the WHERE conditions of the SQL statements actually run. If the column is not used in SQL, there is no need to gather a histogram, because gathering one is CPU-intensive.

@ scripts

    -- anatab.sql
    set timing on
    BEGIN
      DBMS_STATS.GATHER_TABLE_STATS(
        ownname          => '&ownname',
        tabname          => '&tabname',
        estimate_percent => &estimate_percent,
        method_opt       => 'for all columns size &skewonly_repeat_auto',
        no_invalidate    => FALSE,
        degree           => &degree,
        cascade          => TRUE);
    END;
    /
    set timing off

    -- anatab_col.sql
    BEGIN
      DBMS_STATS.GATHER_TABLE_STATS(
        ownname          => '&owner',
        tabname          => '&table_name',
        estimate_percent => 100,
        method_opt       => 'for columns &columns',  -- such as: col1, col2, col3...
        no_invalidate    => FALSE,
        degree           => 4,
        granularity      => 'ALL',
        cascade          => TRUE);
    END;
    /

    -- getcolstat.sql
    col COLUMN_NAME for a30
    select a.column_name,
           b.num_rows,
           a.num_distinct cardinality,
           round(a.num_distinct / b.num_rows * 100, 2) selectivity,
           a.histogram,
           a.num_buckets,
           a.last_analyzed
      from dba_tab_col_statistics a, dba_tables b
     where a.owner = b.owner
       and a.table_name = b.table_name
       and a.owner = upper('&owner')
       and a.table_name = upper('&table_name');

    -- getplan.sql
    set feedback off
    pro 'general,outline,starts'
    pro
    acc type prompt 'Enter value for plan type:' default 'general'
    select * from table(dbms_xplan.display) where '&type' = 'general';
    select * from table(dbms_xplan.display(null, null, 'advanced -projection')) where '&type' = 'outline';
    select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST')) where '&type' = 'starts';
    set feedback on
    undef type
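To see what the optimizer actually stored for the OBJECT_TYPE column in the walkthrough above, the histogram buckets can be read from the data dictionary. The query below is a minimal sketch, assuming the table is GGS.A as in the session and that a frequency histogram exists on OBJECT_TYPE. In a frequency histogram ENDPOINT_NUMBER is a cumulative row count, so the difference between consecutive buckets approximates the number of rows for each value. For character columns ENDPOINT_VALUE is a numeric encoding of the string, and ENDPOINT_ACTUAL_VALUE may be NULL depending on the version and data.

```sql
-- Sketch only: assumes owner GGS, table A, and a frequency histogram on OBJECT_TYPE.
SELECT endpoint_number,
       -- Cumulative count minus the previous bucket's count = rows for this value.
       endpoint_number
         - LAG(endpoint_number, 1, 0) OVER (ORDER BY endpoint_number) AS rows_for_value,
       endpoint_value,
       endpoint_actual_value
FROM   dba_tab_histograms
WHERE  owner       = 'GGS'
AND    table_name  = 'A'
AND    column_name = 'OBJECT_TYPE'
ORDER  BY endpoint_number;
```

If the histogram matches the session above, one of the buckets should account for the 946 INDEX rows that the second execution plan reported.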



How can this Oracle SELECT statement be optimized?

1)
WHERE zone_id = :zoneId AND
stat_time >= TO_DATE(:start, 'yyyy-mm-dd') AND
stat_time < TO_DATE(:endDate, 'yyyy-mm-dd') + 1

2) Create an appropriate index.

If this is a single SQL statement, there is nothing else to optimize. TO_CHAR(stat_time, 'yyyy-mm-dd') is called three times; could you consider using a stored procedure?
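As a rough illustration of point 2, a composite index on the equality column followed by the range column would usually suit the predicate in point 1. The table name zone_stat, the index name, and the bind variable start_date are assumptions made up for this sketch, since the original question does not show the table definition.

```sql
-- Hypothetical table and index names; only zone_id and stat_time come from the answer.
CREATE INDEX zone_stat_zone_time_idx
  ON zone_stat (zone_id, stat_time);

-- The predicate compares stat_time directly (no TO_CHAR on the column),
-- so the optimizer is free to use the index for a range scan.
SELECT COUNT(*)
FROM   zone_stat
WHERE  zone_id   = :zoneId
AND    stat_time >= TO_DATE(:start_date, 'yyyy-mm-dd')
AND    stat_time <  TO_DATE(:endDate, 'yyyy-mm-dd') + 1;
```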

What is the difference between Oracle 11g and 12c?

At OOW 2012, Tom Kyte introduced 12 new features of Oracle Database 12c, Oracle's new-generation flagship database product. All the major OpenWorld 2012 PDF files can currently be downloaded; to find them, search the Content Catalog for Oracle OpenWorld 2012 sessions.
Tom's 12 Things About The Latest Generation of Database Technology.

Here are the 12 feature enhancements as Tom sees them:

#1 Even better PL/SQL from SQL. PL/SQL can be embedded and run directly inside a SQL statement. Presumably the interaction between the SQL engine and the PL/SQL engine has been optimized so that this incurs less context switching than a traditional function call from SQL.
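As a rough sketch of this feature, 12c allows a PL/SQL function to be declared in the WITH clause of a query. The example assumes a 12c or later database and the table a created earlier in this article; the function name double_it is made up for illustration.

```sql
-- Hypothetical example: double_it is an illustrative name, not taken from the talk.
-- Requires Oracle 12c or later. In SQL*Plus the statement is terminated with "/"
-- because semicolons are part of the embedded PL/SQL.
WITH
  FUNCTION double_it (n NUMBER) RETURN NUMBER IS
  BEGIN
    RETURN n * 2;
  END;
SELECT object_id,
       double_it(object_id) AS doubled
FROM   a
WHERE  rownum <= 5
/
```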

#2 Improved Defaults. The DEFAULT clause is enhanced: a default can now refer directly to a sequence, and a column can be declared as an identity column.

Default to a sequence
Default when null inserted
Identity Type
Metadata-only Defaults for NULL columns
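The first three items in the list above can be sketched in a single table definition. This is a minimal example assuming Oracle 12c or later; the names demo_seq and demo_orders are hypothetical.

```sql
-- Hypothetical object names; requires Oracle 12c or later.
CREATE SEQUENCE demo_seq;

CREATE TABLE demo_orders (
  id        NUMBER GENERATED ALWAYS AS IDENTITY,    -- identity column
  order_no  NUMBER DEFAULT demo_seq.NEXTVAL,        -- default taken from a sequence
  status    VARCHAR2(10) DEFAULT ON NULL 'NEW'      -- default applied when NULL is inserted
);

-- status receives 'NEW' even though NULL is supplied explicitly;
-- id and order_no are filled in automatically.
INSERT INTO demo_orders (status) VALUES (NULL);
```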

#3 Increased Size Limit for VARCHAR2, NVARCHAR2, and RAW Data Types
The VARCHAR2, NVARCHAR2 and RAW data types can be extended to 32 KB, the same as the corresponding variable types in PL/SQL. Of course, values that long may be stored out of line, like LOBs.

#4 Easy Top-N and pagination queries
Provides a Row Limiting Clause, with syntax similar to LIMIT in MySQL:

FETCH FIRST 5 ROWS ONLY;  -- fetch only the first 5 rows
FETCH NEXT 0.01 PERCENT ROWS ONLY;  -- fetch only 0.01% of the rows
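The clause is appended to an ordinary query. The statements below are a small sketch run against the table a created earlier in this article, assuming Oracle 12c or later; the page size and offsets are arbitrary.

```sql
-- Top-N: the first 5 rows by object_id.
SELECT object_id, object_name
FROM   a
ORDER  BY object_id
FETCH FIRST 5 ROWS ONLY;

-- Pagination: skip the first 10 rows, then return the next 5 (page 3 at 5 rows per page).
SELECT object_id, object_name
FROM   a
ORDER  BY object_id
OFFSET 10 ROWS FETCH NEXT 5 ROWS ONLY;

-- Percentage: roughly 1 percent of the rows.
SELECT object_id, object_name
FROM   a
ORDER  BY object_id
FETCH FIRST 1 PERCENT ROWS ONLY;
```

Without an ORDER BY the rows returned are not deterministic, so Top-N and paging queries normally include one.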

#5 Row Pattern Matching
A new pattern matching clause, MATCH_RECOGNIZE, is provided. You can use it to define row patterns with a regular-expression-like syntax.
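A typical use is finding a V shape (a fall followed by a rise) in a series of prices. The query below is a sketch that assumes a hypothetical table ticker(symbol, tstamp, price); neither the table nor its columns come from the talk.

```sql
-- Hypothetical table: ticker(symbol, tstamp, price). Requires Oracle 12c or later.
SELECT *
FROM   ticker
MATCH_RECOGNIZE (
  PARTITION BY symbol                    -- match within each symbol separately
  ORDER BY tstamp                        -- rows are examined in time order
  MEASURES strt.tstamp       AS start_tstamp,
           LAST(down.tstamp) AS bottom_tstamp,
           LAST(up.tstamp)   AS end_tstamp
  ONE ROW PER MATCH
  AFTER MATCH SKIP TO LAST up
  PATTERN (strt down+ up+)               -- regular-expression-like row pattern
  DEFINE
    down AS down.price < PREV(down.price),
    up   AS up.price   > PREV(up.price)
) mr
ORDER  BY mr.symbol, mr.start_tstamp;
```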

#6 Partitioning Improvements. Partitioning is enhanced with, among other things, asynchronous global index maintenance for DROP and TRUNCATE PARTITION operations and the Interval + Reference partitioning method:
Asynchronous Global Index Maintenance for DROP and TRUNCATE Partition
Cascade Functionality for TRUNCATE and EXCHANGE partition
Multiple partition operations in a single DDL
Online move of a partition (without DBMS_REDEFINITION)
Interval + Reference partitioning

#7 Adaptive Execution Plans. This feature is impressive: the final execution plan is chosen based on the rows actually obtained during execution, which overcomes problems caused by column skew.

#8 Enhanced Statistics. Adds dynamic sampling level 11. For parallel query...
