About _gby_hash_aggregation_enabled, the Oracle hidden parameter that makes Oracle apply hash processing automatically when reading data

1. History of the parameter _gby_hash_aggregation_enabled

Oracle 11.1.0:
Parameter Name: _gby_hash_aggregation_enabled
Description: enable group-by and aggregation using hash scheme
Type: BOOL Obsoleted: FALSE
Can ALTER SESSION: TRUE Can ALTER SYSTEM: IMMEDIATE
Oracle 10.2.0:
Parameter Name: _gby_hash_aggregation_enabled
Description: enable group-by and aggregation using hash scheme
Type: BOOL Obsoleted: FALSE
Can ALTER SESSION: TRUE Can ALTER SYSTEM: IMMEDIATE
Oracle 10.1.0:
No such parameter in Oracle 10.1.0.
Oracle 9.2.0:
No such parameter in Oracle 9.2.0.
Oracle 8.1.7:
No such parameter in Oracle 8.1.7.
Oracle 8.0.6:
No such parameter in Oracle 8.0.6.
Oracle 7.3.4:
No such parameter in Oracle 7.3.4.

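2. Oracle's two GROUP BY methods: hash group by and sort group by

Oracle 10g improved the algorithm used for DISTINCT operations: HASH UNIQUE replaces the earlier SORT UNIQUE. This behavior is controlled by the hidden parameter "_gby_hash_aggregation_enabled", which defaults to TRUE when optimizer_features_enable is set to 10.2.0.1.

HASH UNIQUE should have a lower CPU cost than SORT UNIQUE because it does not have to sort its input. For the same reason, HASH JOIN is chosen often and SORT MERGE JOIN rarely.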

SQL>  create table t as select * from dba_users;
Table created.

SQL> set autotrace on
SQL> select distinct password from t;
-----------------------------------
| Id  | Operation          | Name |
-----------------------------------
|   0 | SELECT STATEMENT   |      |
|   1 |  SORT UNIQUE       |      |
|   2 |   TABLE ACCESS FULL| T    |
-----------------------------------
Note
-----
- rule based optimizer used (consider using cbo)
Statistics
----------------------------------------------------------
1  recursive calls
0  db block gets
3  consistent gets
1  physical reads
0  redo size
752  bytes sent via SQL*Net to client
469  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
1  sorts (memory)
0  sorts (disk)
9  rows processed

Under the rule-based optimizer (RBO) a sort is still performed, and the plan uses SORT UNIQUE.

SQL> show parameters opt
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_features_enable            string      10.2.0.1
optimizer_mode                       string      RULE

SQL> alter session set optimizer_mode = choose;
Session altered.

SQL> analyze table t compute statistics;
Table analyzed.

SQL> select distinct password from t;

Execution Plan
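----------------------------------------------------------
Plan hash value: 1901613472
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     9 |   144 |     3  (34)| 00:00:01 |
|   1 |  HASH UNIQUE       |      |     9 |   144 |     3  (34)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T    |     9 |   144 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------
Statistics
----------------------------------------------------------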
1  recursive calls
0  db block gets
3  consistent gets
0  physical reads
0  redo size
752  bytes sent via SQL*Net to client
469  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
9  rows processed

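HASH UNIQUE avoids the sort (note "0 sorts (memory)" above). On this tiny table the cost is identical, but with a large data volume it should show a lower CPU cost.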

SQL>  ALTER SESSION SET "_gby_hash_aggregation_enabled" = FALSE;
SQL>  select distinct password from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 965418380
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     9 |   144 |     3  (34)| 00:00:01 |
|   1 |  SORT UNIQUE       |      |     9 |   144 |     3  (34)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T    |     9 |   144 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------
Statistics
----------------------------------------------------------
1  recursive calls
0  db block gets
3  consistent gets
0  physical reads
0  redo size
752  bytes sent via SQL*Net to client
469  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
1  sorts (memory)
0  sorts (disk)
9  rows processed
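
The nine-row table above is too small to show any difference in cost between the two plans. A minimal sketch of how the comparison could be repeated on a larger table follows. The table name t_big, the use of dba_objects as a row source and the 20x multiplier are assumptions made for illustration and are not part of the original test; no timings are claimed here.

```sql
-- Hypothetical larger test table: dba_objects repeated 20 times (assumed size).
create table t_big as
select o.*
  from dba_objects o,
       (select level as n from dual connect by level <= 20);

analyze table t_big compute statistics;

-- Make sure the cost-based optimizer is used (the session above started in RULE mode).
alter session set optimizer_mode = all_rows;

set timing on
set autotrace traceonly

-- Hash aggregation enabled (the 10.2 default): the plan should show HASH UNIQUE.
alter session set "_gby_hash_aggregation_enabled" = true;
select distinct owner, object_type from t_big;

-- Hash aggregation disabled: the plan should fall back to SORT UNIQUE.
alter session set "_gby_hash_aggregation_enabled" = false;
select distinct owner, object_type from t_big;

-- Restore the default and clean up.
alter session set "_gby_hash_aggregation_enabled" = true;
set autotrace off
drop table t_big;
```

Comparing the elapsed time and the "sorts (memory)" / "sorts (disk)" lines of the two runs is what would confirm or refute the expectation stated above.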


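3. The hash group by bug and its workaround (reposted)

I have not run into this bug myself, so I quote an earlier author's experience here. I hope he does not mind.


In 10gR2, GROUP BY changed from the former SORT GROUP BY to HASH GROUP BY. This algorithmic improvement removes the sort that SORT GROUP BY has to perform. Because a hash algorithm is used, collisions are possible. godlessme on itpub ran into exactly this kind of problem, which should probably be counted as a bug.

The following shows how to work around it. To fix the incorrect ordering caused by HASH GROUP BY, simply go back to the old SORT GROUP BY. 10gR2 introduced the hidden parameter _gby_hash_aggregation_enabled, which defaults to TRUE; setting it to FALSE is enough.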

    SQL> select status,count(*) from tmp_object group by status;
    STATUS COUNT(*)
    ---- -----
    INVALID 29
    VALID 10236

    Execution Plan
    -----------------------------
    Plan hash value: 3490974944
    -------------------------------------
    | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
    -------------------------------------
    | 0 | SELECT STATEMENT | | 2 | 12 | 35 (6)| 00:00:01 |
    | 1 | HASH GROUP BY | | 2 | 12 | 35 (6)| 00:00:01 |
    | 2 | TABLE ACCESS FULL| TMP_OBJECT | 10265 | 61590 | 33 (0)| 00:00:01 |
    -------------------------------------
    Statistics
    -----------------------------
    24 recursive calls
    0 db block gets
    136 consistent gets
    0 physical reads
    0 redo size
    522 bytes sent via SQL*Net to client
    385 bytes received via SQL*Net from client
    2 SQL*Net roundtrips to/from client
    0 sorts (memory)
    0 sorts (disk)
    2 rows processed

    SQL> col ksppinm format a39
    SQL> col ksppstvl format a39
    SQL> select ksppinm, ksppstvl 
    2 from x$ksppi pi, x$ksppcv cv 
    3 where cv.indx=pi.indx and pi.ksppinm like '\_%' escape '\'
    4 and pi.ksppinm like '%&parameter%';
    Enter value for parameter: gby
    old 4: and pi.ksppinm like '%&parameter%'
    new 4: and pi.ksppinm like '%gby%'

    KSPPINM KSPPSTVL
    -------------------- ------------
    _gby_onekey_enabled TRUE
    _gby_hash_aggregation_enabled TRUE

    SQL> alter session set "_gby_hash_aggregation_enabled"=false;
    Session altered.
    SQL> select status,count(*) from tmp_object group by status;
    STATUS COUNT(*)
    ---- -----
    INVALID 29
    VALID 10312

    Execution Plan
    -----------------------------
    Plan hash value: 1360369603
    -------------------------------------
    | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
    -------------------------------------
    | 0 | SELECT STATEMENT | | 10860 | 54300 | 32 (7)| 00:00:01 |
    | 1 | SORT GROUP BY | | 10860 | 54300 | 32 (7)| 00:00:01 |
    | 2 | TABLE ACCESS FULL| TMP_OBJECT | 10860 | 54300 | 30 (0)| 00:00:01 |
    -------------------------------------
    Statistics
    -----------------------------
    0 recursive calls
    0 db block gets
    134 consistent gets
    0 physical reads
    0 redo size
    522 bytes sent via SQL*Net to client
    385 bytes received via SQL*Net from client
    2 SQL*Net roundtrips to/from client
    0 sorts (memory)
    0 sorts (disk)
    2 rows processed

Reposted from: http://tech.it168.com/db/o/2006-11-12/200611122129197.shtml
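
Whichever aggregation method the optimizer picks, GROUP BY on its own does not promise any row order; the sorted output of SORT GROUP BY was a side effect of the algorithm. If an application needs ordered results, a sketch of the more robust fix is to state the order explicitly instead of relying on the hidden parameter. The example reuses the tmp_object table from the quoted test.

```sql
-- Ask for the order explicitly; the result is ordered regardless of
-- the value of "_gby_hash_aggregation_enabled".
select status, count(*)
  from tmp_object
 group by status
 order by status;
```

With the ORDER BY in place the optimizer is free to use either a SORT GROUP BY or a HASH GROUP BY followed by a SORT ORDER BY, and the rows come back in the requested order either way.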

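In my view, Oracle by default reads table data block by block, and when we load data some of it may be laid out in the blocks in ascending or descending order. Rows read from the blocks therefore tend to come back in a regular order, or at least the sort done by SORT GROUP BY happens to meet our needs, and for small tables reading this way has little effect on performance. Since 10gR2, however, the default method is HASH GROUP BY, so data that used to come back in order now needs an explicit ORDER BY and an in-memory sort. In addition, queries against large tables may fail with the following error:

ORA-00600: internal error code, arguments: [32695], [hash aggregation can't be done], [], [], [], [], [], []

For this error, the workaround available to us is to change the parameter _gby_hash_aggregation_enabled from its default to FALSE.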
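
As a sketch of what this change could look like beyond a single session: the parameter listing at the top of the article shows that ALTER SYSTEM is allowed with immediate effect. The example assumes the instance was started with an spfile (required for SCOPE = BOTH). The NO_USE_HASH_AGGREGATION hint is undocumented; to my knowledge it is available from 10.2 and is shown only as a possible narrower alternative for a single statement. big_table and some_col are hypothetical names. Hidden parameters are normally changed only on the advice of Oracle Support.

```sql
-- Instance-wide, effective immediately and persisted (assumes an spfile is in use).
alter system set "_gby_hash_aggregation_enabled" = false scope = both;

-- Narrower alternative: disable hash aggregation for one statement only.
-- big_table and some_col are hypothetical names; the hint is undocumented.
select /*+ no_use_hash_aggregation */ some_col, count(*)
  from big_table
 group by some_col;

-- To return to the default later, remove the entry from the spfile
-- (takes effect at the next instance restart).
alter system reset "_gby_hash_aggregation_enabled" scope = spfile sid = '*';
```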

Appendix: how to view hidden parameters (replace xxx with part of the parameter name):

SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE, x.ksppdesc describ
       FROM SYS.x$ksppi x, SYS.x$ksppcv y
      WHERE x.inst_id = USERENV ('Instance')
        AND y.inst_id = USERENV ('Instance')
        AND x.indx = y.indx
        AND x.ksppinm LIKE '%xxx%';
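
For the parameter discussed in this article, the same query can be narrowed and extended. The sketch below assumes a connection as SYS (the X$ tables are not visible to ordinary users) and adds the KSPPSTDF column of x$ksppcv, which reports whether the value is still the default. The column aliases are my own.

```sql
-- List the "_gby..." hidden parameters with their current value and
-- a flag telling whether that value is the default.
select x.ksppinm  as param_name,
       y.ksppstvl as param_value,
       y.ksppstdf as is_default,
       x.ksppdesc as description
  from sys.x$ksppi x, sys.x$ksppcv y
 where x.inst_id = userenv('Instance')
   and y.inst_id = userenv('Instance')
   and x.indx = y.indx
   and x.ksppinm like '\_gby%' escape '\';
```

The backslash escape matters here: without it the leading underscore is a single-character wildcard, and the pattern would match any name whose second to fourth characters are "gby".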

