Extended Optimizer Statistics in Oracle 11g Improve Performa

來源:互聯網
上載者:User

oracle最佳化器會利用統計資訊來選擇最優或者次優的執行計畫,在確定執行計畫的過程中,oracle可以非常很好的分析每個表列的資料分布特徵,但是對於列與列之間的某些關聯關係,oracle是無能為力的,例如城市名稱和郵編之間的對應關係。值得慶幸的是,在oracle11g中,引入了基於列組合和關聯運算式的統計分析功能,這無疑為最佳化器做出更正確的抉擇提供了有力的支援。我們稱擴充統計資訊功能。

對於擴充統計分析功能,我們大概可以分為兩類:基於列組合基於運算式。而實現的方法也存在兩種:通過DBMS_STATS.CREATE_EXTENDED_STATS通過method_opt參數

DBMS_STATS.CREATE_EXTENDED_STATS和METHOD_OPT

建立擴充的統計資訊後,我們可以在DBA_STAT_EXTENSIONS和dba_tab_col_statistics視圖中查詢到相關資訊。樣本如下:

首先看一下METHOD_OPT方法

SQL> show userUSER 為 "SH"SQL> exec dbms_stats.delete_table_stats('SH','CUSTOMERS');PL/SQL 過程已成功完成。SQL> exec  Dbms_Stats.drop_Extended_Stats('SH','CUSTOMERS','(cust_state_province,country_id)');PL/SQL 過程已成功完成。SQL> Select * From User_Stat_Extensions;未選定行SQL> select * from user_tab_col_statistics where table_name='CUSTOMERS';未選定行SQL> exec dbms_stats.gather_table_stats('SH','CUSTOMERS',METHOD_OPT=>'FOR ALL COLUMNS SIZE 1,for columns (cust_state_province,country_id) size auto ');PL/SQL 過程已成功完成。SQL> select table_name,column_name from user_tab_col_statistics where table_name='CUSTOMERS';TABLE_NAME       COLUMN_NAME------------------------------ ------------------------------CUSTOMERS       SYS_STU#S#WF25Z#QAHIHE#MOFFMM_CUSTOMERS       CUST_IDCUSTOMERS       CUST_FIRST_NAMECUSTOMERS       CUST_LAST_NAMECUSTOMERS       CUST_GENDERCUSTOMERS       CUST_YEAR_OF_BIRTHCUSTOMERS       CUST_MARITAL_STATUSCUSTOMERS       CUST_STREET_ADDRESSCUSTOMERS       CUST_POSTAL_CODECUSTOMERS       CUST_CITYCUSTOMERS       CUST_CITY_IDTABLE_NAME       COLUMN_NAME------------------------------ ------------------------------CUSTOMERS       CUST_STATE_PROVINCECUSTOMERS       CUST_STATE_PROVINCE_IDCUSTOMERS       COUNTRY_IDCUSTOMERS       CUST_MAIN_PHONE_NUMBERCUSTOMERS       CUST_INCOME_LEVELCUSTOMERS       CUST_CREDIT_LIMITCUSTOMERS       CUST_EMAILCUSTOMERS       CUST_TOTALCUSTOMERS       CUST_TOTAL_IDCUSTOMERS       CUST_SRC_IDCUSTOMERS       CUST_EFF_FROMTABLE_NAME       COLUMN_NAME------------------------------ ------------------------------CUSTOMERS       CUST_EFF_TOCUSTOMERS       CUST_VALID已選擇24行。SQL> Select * From User_Stat_Extensions;TABLE_NAME       EXTENSION_NAME------------------------------ ------------------------------EXTENSION CREATO DRO-------------------------------------------------------------------------------- ------ ---CUSTOMERS       SYS_STU#S#WF25Z#QAHIHE#MOFFMM_("CUST_STATE_PROVINCE","COUNTRY_ID") USERYES

使用method_opt會產生擴充列並直接統計擴充資訊

再來看看DBMS_STATS.CREATE_EXTENDED_sTATS

SQL> exec dbms_stats.delete_table_stats('SH','CUSTOMERS');PL/SQL 過程已成功完成。SQL> exec  Dbms_Stats.drop_Extended_Stats('SH','CUSTOMERS','(cust_state_province,country_id)');PL/SQL 過程已成功完成。SQL> Select * From User_Stat_Extensions;未選定行SQL> select table_name,column_name from user_tab_col_statistics where table_name='CUSTOMERS';未選定行SQL> SELECT DBMS_STATS.CREATE_EXTENDED_STATS('SH','CUSTOMERS','(cust_state_province,country_id)') FROM DUAL; DBMS_STATS.CREATE_EXTENDED_STATS('SH','CUSTOMERS','(CUST_STATE_PROVINCE,COUNTRY_ID)')--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------SYS_STU#S#WF25Z#QAHIHE#MOFFMM_SQL> select table_name,column_name from user_tab_col_statistics where table_name='CUSTOMERS';未選定行SQL> Select * From User_Stat_Extensions;TABLE_NAME       EXTENSION_NAME      EXTENSION           CREATO DRO------------------------------ ------------------------------ -------------------------------------------------------------------------------- ------ ---CUSTOMERS       SYS_STU#S#WF25Z#QAHIHE#MOFFMM_ ("CUST_STATE_PROVINCE","COUNTRY_ID")          USER   YESSQL> exec dbms_stats.gather_table_stats('SH','CUSTOMERS',METHOD_OPT=>'FOR ALL COLUMNS SIZE 1 ');PL/SQL 過程已成功完成。SQL> select table_name,column_name from user_tab_col_statistics where table_name='CUSTOMERS';TABLE_NAME       COLUMN_NAME------------------------------ ------------------------------CUSTOMERS       SYS_STU#S#WF25Z#QAHIHE#MOFFMM_CUSTOMERS       CUST_IDCUSTOMERS       CUST_FIRST_NAMECUSTOMERS       CUST_LAST_NAMECUSTOMERS       CUST_GENDERCUSTOMERS       CUST_YEAR_OF_BIRTHCUSTOMERS       CUST_MARITAL_STATUSCUSTOMERS       CUST_STREET_ADDRESSCUSTOMERS       CUST_POSTAL_CODECUSTOMERS       CUST_CITYCUSTOMERS       CUST_CITY_IDTABLE_NAME       COLUMN_NAME------------------------------ ------------------------------CUSTOMERS       CUST_STATE_PROVINCECUSTOMERS       CUST_STATE_PROVINCE_IDCUSTOMERS       COUNTRY_IDCUSTOMERS       CUST_MAIN_PHONE_NUMBERCUSTOMERS       CUST_INCOME_LEVELCUSTOMERS       CUST_CREDIT_LIMITCUSTOMERS       CUST_EMAILCUSTOMERS       CUST_TOTALCUSTOMERS       CUST_TOTAL_IDCUSTOMERS       CUST_SRC_IDCUSTOMERS       CUST_EFF_FROMTABLE_NAME       COLUMN_NAME------------------------------ ------------------------------CUSTOMERS       CUST_EFF_TOCUSTOMERS       CUST_VALID已選擇24行。

可以看出,DBMS_STATS.CREATE_EXTEND_sTATS不會直接建立統計資訊,需要我們手工調用DBMS_STATS.GATHER_TABLE_STATS過程。


EXPRESSION

除了可以按照列組合的方式建立擴充統計資訊,還可以按照運算式的方式來建立擴充統計資訊,這對於基於函數的索引的非常有用的。

SQL> select dbms_Stats.create_extended_Stats('SH','CUSTOMERS','(MOD(CUST_ID,10))') FROM DUAL;DBMS_STATS.CREATE_EXTENDED_STATS('SH','CUSTOMERS','(MOD(CUST_ID,10))')----------------------------------------------------------------------------------------------------SYS_STU1D2S2K6$TFSJ$24PUR2SN1ESQL> Select * From User_Stat_Extensions;TABLE_NAME       EXTENSION_NAME------------------------------ ------------------------------EXTENSION CREATO DRO-------------------------------------------------------------------------------- ------ ---CUSTOMERS       SYS_STU1D2S2K6$TFSJ$24PUR2SN1E(MOD("CUST_ID",10)) USERYES


擴充統計是如何協助最佳化器的?

我們通過下面的例子來示範一下擴充統計資訊是如何協助最佳化器正確評估sql語句的選擇性的,

SQL> Select * From User_Stat_Extensions;   --沒有開啟擴充統計資訊未選定行SQL> exec dbms_stats.gather_table_stats('SH','CUSTOMERS',METHOD_OPT=>'FOR ALL COLUMNS SIZE 1 ,for columns cust_state_province size 250, for columns country_id size 250');PL/SQL 過程已成功完成。SQL> set autotrace on explainSQL> Select Count(*) From Customers Where Cust_State_Province='CA';  COUNT(*)----------      3341   --實際存在數量為3341執行計畫----------------------------------------------------------Plan hash value: 296924608--------------------------------------------------------------------------------| Id  | Operation   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |--------------------------------------------------------------------------------|   0 | SELECT STATEMENT   |       |     1 |    11 |   413 (1)| 00:00:05 ||   1 |  SORT AGGREGATE    |       |     1 |    11 |    |       ||*  2 |   TABLE ACCESS FULL| CUSTOMERS |  3359 | 36949 |   413 (1)| 00:00:05 |   --最佳化器估計的數量為3359,與3341基本接近--------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   2 - filter("CUST_STATE_PROVINCE"='CA')SQL> Select Count(*) From Customers Where Country_Id=52790;  COUNT(*)----------     18520  --實際存在數量為 18520執行計畫----------------------------------------------------------Plan hash value: 296924608--------------------------------------------------------------------------------| Id  | Operation   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |--------------------------------------------------------------------------------|   0 | SELECT STATEMENT   |       |     1 |     5 |   413 (1)| 00:00:05 ||   1 |  SORT AGGREGATE    |       |     1 |     5 |    |       | |*  2 |   TABLE ACCESS FULL| CUSTOMERS | 18863 | 94315 |   413 (1)| 00:00:05 |   --最佳化器估計數量為18863,與18520基本接近--------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   2 - filter("COUNTRY_ID"=52790)SQL> SELECT COUNT(*) FROM CUSTOMERS WHERE CUST_STATE_PROVINCE='CA' AND COUNTRY_ID=52790;  COUNT(*)----------      3341  --實際值數量為3341,這是因為country_id和CUST_STAT_PROVINCE之間存在某種數值關係,而這種關係,oracle資料庫是不知的執行計畫----------------------------------------------------------Plan hash value: 296924608--------------------------------------------------------------------------------| Id  | Operation   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |--------------------------------------------------------------------------------|   0 | SELECT STATEMENT   |       |     1 |    16 |   413 (1)| 00:00:05 ||   1 |  SORT AGGREGATE    |       |     1 |    16 |    |       ||*  2 |   TABLE ACCESS FULL| CUSTOMERS |  1142 | 18272 |   413 (1)| 00:00:05 | 
 --由於oracle不知道--------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   2 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790)SQL> Select Count(*) From Customers Where Country_Id=52770;  COUNT(*)----------      7780執行計畫----------------------------------------------------------Plan hash value: 296924608--------------------------------------------------------------------------------| Id  | Operation   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |--------------------------------------------------------------------------------|   0 | SELECT STATEMENT   |       |     1 |     5 |   413 (1)| 00:00:05 ||   1 |  SORT AGGREGATE    |       |     1 |     5 |    |       ||*  2 |   TABLE ACCESS FULL| CUSTOMERS |  7380 | 36900 |   413 (1)| 00:00:05 |--------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   2 - filter("COUNTRY_ID"=52770)SQL> SELECT COUNT(*) FROM CUSTOMERS WHERE CUST_STATE_PROVINCE='CA' AND COUNTRY_ID=52770;  COUNT(*)---------- 0  --實際值為0執行計畫----------------------------------------------------------Plan hash value: 296924608--------------------------------------------------------------------------------| Id  | Operation   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |--------------------------------------------------------------------------------|   0 | SELECT STATEMENT   |       |     1 |    16 |   413 (1)| 00:00:05 ||   1 |  SORT AGGREGATE    |       |     1 |    16 |    |       ||*  2 |   TABLE ACCESS FULL| CUSTOMERS |   447 |  7152 |   413 (1)| 00:00:05 |--------------------------------------------------------------------------------
--由於oracle不知道

Predicate Information (identified by operation id):---------------------------------------------------   2 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52770)

可以看出,如果oracle不瞭解列country_id和CUST_STAT_PROVINCE之間的數量關係,最佳化器估計的結果和實際值之間是存在差距的,下面看一下添加擴充統計資訊後的結果

SQL> SELECT DBMS_STATS.CREATE_EXTENDED_STATS('SH','CUSTOMERS','(cust_state_province,country_id)') FROM DUAL; DBMS_STATS.CREATE_EXTENDED_STATS('SH','CUSTOMERS','(CUST_STATE_PROVINCE,COUNTRY_ID)')----------------------------------------------------------------------------------------------------SYS_STU#S#WF25Z#QAHIHE#MOFFMM_SQL> exec dbms_stats.gather_table_stats('SH','CUSTOMERS',METHOD_OPT=>'FOR ALL COLUMNS SIZE 1 ,for columns cust_state_province size 250, for columns country_id size 250');PL/SQL 過程已成功完成。SQL> select table_name,column_name,histogram from user_tab_col_statistics where table_name='CUSTOMERS' AND column_name like 'SYS%';TABLE_NAME       COLUMN_NAME      HISTOGRAM------------------------------ ------------------------------ ---------------CUSTOMERS       SYS_STU#S#WF25Z#QAHIHE#MOFFMM_ NONESQL> set autotrace on explainSQL> SELECT COUNT(*) FROM CUSTOMERS WHERE CUST_STATE_PROVINCE='CA' AND COUNTRY_ID=52770;  COUNT(*)---------- 0執行計畫----------------------------------------------------------Plan hash value: 296924608--------------------------------------------------------------------------------| Id  | Operation   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |--------------------------------------------------------------------------------|   0 | SELECT STATEMENT   |       |     1 |    16 |   413 (1)| 00:00:05 ||   1 |  SORT AGGREGATE    |       |     1 |    16 |    |       ||*  2 |   TABLE ACCESS FULL| CUSTOMERS |   432 |  6912 |   413 (1)| 00:00:05 |--------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   2 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52770)SQL> set autotrace offSQL> exec dbms_stats.gather_table_stats('SH','CUSTOMERS',METHOD_OPT=>'FOR ALL COLUMNS SIZE auto ,for columns cust_state_province size 250, for columns country_id size 250');PL/SQL 過程已成功完成。SQL> select table_name,column_name,histogram from user_tab_col_statistics where table_name='CUSTOMERS' AND column_name like 'SYS%';TABLE_NAME       COLUMN_NAME      HISTOGRAM------------------------------ ------------------------------ ---------------CUSTOMERS       SYS_STU#S#WF25Z#QAHIHE#MOFFMM_ FREQUENCYSQL> set autotrace on explainSQL> SELECT COUNT(*) FROM CUSTOMERS WHERE CUST_STATE_PROVINCE='CA' AND COUNTRY_ID=52770;  COUNT(*)---------- 0執行計畫----------------------------------------------------------Plan hash value: 296924608--------------------------------------------------------------------------------| Id  | Operation   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |--------------------------------------------------------------------------------|   0 | SELECT STATEMENT   |       |     1 |    16 |   413 (1)| 00:00:05 ||   1 |  SORT AGGREGATE    |       |     1 |    16 |    |       ||*  2 |   TABLE ACCESS FULL| CUSTOMERS |     5 |    80 |   413 (1)| 00:00:05 |--------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   2 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52770)

擴充的統計資訊發揮作用,協助最佳化器做出了正確的評估。


Extended Statistics Usage Notes:  

The Oracle documentation notes these limitations on the dbms_stats.create_extended_stats extension argument:

- The extension cannot contain a virtual column.

- Extensions cannot be created on tables owned by SYS.

- Extensions cannot be created on cluster tables, index organized tables, temporary tables or external tables.

- The total number of extensions in a table cannot be greater than a maximum of (20, 10% of number of non-virtual columns in the table).

- The number of columns in a column group must be in the range [2, 32].

- A column cannot appear more than once in a column group.

- A column group can not contain expressions.

- An expression must contain at least one column.

- An expression cannot contain a subquery.

- The COMPATIBLE parameter needs to be 11.0.0.0.0 or greater


相關文章

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.