A hash join is a table join method in which Oracle relies primarily on hashing to produce the joined result set of two tables.
For a sort merge join, if the result sets of the two tables are still large after the predicates in the target SQL have been applied, and they still have to be sorted, the sort merge join will be inefficient. For a nested loops join, if the result set of the driving table contains a large number of rows, the join will be equally inefficient even if there is an index on the join column of the driven table. To solve this problem, Oracle introduced the hash join. In Oracle 10g and later versions, whether the optimizer (in fact the CBO, because the hash join is available only to the CBO) considers a hash join when parsing the target SQL is controlled by the hidden parameter _hash_join_enabled, whose default value is TRUE.
The advantages, limitations, and applicable scenarios of the hash join are:
a. A hash join does not necessarily require sorting; in most cases no sort is needed.
b. The join column of the driving table in a hash join should be as selective as possible.
c. The hash join is available only to the CBO and only for equality join conditions. (Even for a hash anti join, Oracle actually converts it into an equivalent equijoin.)
d. The hash join is well suited to joining a small table with a large table when the join result set contains many rows, especially when the join column of the small table is highly selective. In that case the execution time of the hash join is roughly the time needed for a full table scan of the large table.
e. When two tables are hash joined, if the hash table built from the smaller of the two result sets (after the predicates in the target SQL have been applied) fits entirely in memory (the PGA work area), the hash join executes very efficiently.
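The build and probe behaviour described in points d and e can be sketched in a few lines of Python. This is only a conceptual illustration under simplifying assumptions (everything fits in memory, rows are plain dictionaries), not Oracle's implementation; the function name hash_join and the sample data are hypothetical.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Equijoin two lists of dicts using a build phase and a probe phase."""
    # Build phase: hash the (ideally smaller) input once, entirely in memory.
    buckets = defaultdict(list)
    for row in build_rows:
        buckets[row[build_key]].append(row)
    # Probe phase: scan the other input once and look up equal keys only.
    joined = []
    for row in probe_rows:
        for match in buckets.get(row[probe_key], []):
            joined.append((match, row))
    return joined

# Hypothetical data loosely shaped like the t1/t2 test tables in this post.
t1 = [{"id": i} for i in range(1, 4)]                    # small table
t2 = [{"id": n, "t1_id": n % 5} for n in range(1, 11)]   # larger table
result = hash_join(t1, t2, "id", "t1_id")
```

Building on the small input keeps the hash table small, and the cost is then dominated by the single pass over the large input, which matches the observation in point d.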
Hash joins between Oracle tables have the following characteristics:
1. Both the driving table and the driven table are accessed at most once.
2. The tables in a hash join have a driving order.
3. The tables in a hash join do not need to be sorted, but a hash table is built before the join is performed, using the memory controlled by hash_area_size.
4. The hash join does not apply to the following join conditions: not equal (<>), greater than (>), less than (<), less than or equal to (<=), greater than or equal to (>=), and LIKE.
5. A hash join places no special requirements on indexed columns in the join; indexes are used just as they would be for single-table access.
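Characteristics 1 and 4 can be illustrated with a small Python sketch that counts how many scans each input receives. It is an analogy under stated assumptions, not Oracle code: CountingTable, hash_join, and nested_loops_join are hypothetical helpers, and the nested loops version is deliberately naive (no index on the inner table).

```python
class CountingTable:
    """Wraps a list of rows and counts how many full scans are started."""
    def __init__(self, rows):
        self.rows = rows
        self.scans = 0

    def __iter__(self):
        self.scans += 1
        return iter(self.rows)

def hash_join(build, probe, build_key, probe_key):
    # One pass over the build input to create the hash table.
    buckets = {}
    for row in build:
        buckets.setdefault(row[build_key], []).append(row)
    # One pass over the probe input; the lookup finds equal keys only.
    return [(b, p) for p in probe for b in buckets.get(p[probe_key], [])]

def nested_loops_join(outer, inner, outer_key, inner_key):
    # Naive version: the inner input is rescanned for every outer row.
    return [(o, i) for o in outer for i in inner if o[outer_key] == i[inner_key]]

def make_tables():
    t1 = CountingTable([{"id": i} for i in (1, 2, 3)])
    t2 = CountingTable([{"id": n, "t1_id": n % 5} for n in range(1, 11)])
    return t1, t2

t1, t2 = make_tables()
hj_rows = hash_join(t1, t2, "id", "t1_id")
hj_scans = (t1.scans, t2.scans)

t1, t2 = make_tables()
nl_rows = nested_loops_join(t1, t2, "id", "t1_id")
nl_scans = (t1.scans, t2.scans)
```

Because a dictionary lookup can only answer "which rows have exactly this key", the same structure cannot serve conditions such as <, >, or LIKE, which is the intuition behind characteristic 4.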
The following experiments confirm these conclusions.
For the base tables used in the tests, see this earlier post on my blog:
---- Oracle Table Joins: Nested Loops (Nested Loops Join)
Test: the T2 table is accessed only once
SQL> select /*+ leading(t1) use_hash(t2) */ * from t1,t2 where t1.id=t2.t1_id;
The resulting rows are omitted here.
SQL> select sql_id, child_number, sql_text from v$sql where sql_text like '%use_hash(t2)%';
SQL_ID        CHILD_NUMBER SQL_TEXT
------------- ------------ --------------------------------------------------------------------------------
7d64k5stnc3sk            0 select sql_id, child_number, sql_text from v$sql where sql_text like '%use_hash
036fyatp73h9n            0 select /*+ leading(t1) use_hash(t2) */ * from t1,t2 where t1.id=t2.t1_id
SQL> select * from table(dbms_xplan.display_cursor('036fyatp73h9n', 0, 'allstats last'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  036fyatp73h9n, child number 0
-------------------------------------
select /*+ leading(t1) use_hash(t2) */ * from t1,t2 where t1.id=t2.t1_id
Plan hash value: 1838229974
--------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buff
--------------------------------------------------------------------------------
|*  1 |  HASH JOIN         |      |      1 |    100 |    100 |00:00:00.04 |    1
|   2 |   TABLE ACCESS FULL| T1   |      1 |    100 |    100 |00:00:00.01 |
|   3 |   TABLE ACCESS FULL| T2   |      1 |   100K |   100K |00:00:00.01 |    1
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."ID"="T2"."T1_ID")
Note
-----
   - dynamic sampling used for this statement
Rows selected
As the execution plan above shows (Starts is 1 for both T1 and T2), the driving table and the driven table are each accessed only once in the hash join.
The following experiment shows a case in which both the driving table and the driven table are accessed 0 times.
SQL> select /*+ leading(t1) use_hash(t2) */ * from t1,t2 where t1.id=t2.t1_id and 1=2;
ID NUM Information ID t1_id NUM Information
---------- ---------- -------------------------------------------------------------------------------- ---------- ---------- ---------- --------------------------------------------------------------------------------
SQL> select sql_id, child_number, sql_text from v$sql where sql_text like '%use_hash(t2)%';
SQL_ID        CHILD_NUMBER SQL_TEXT
------------- ------------ --------------------------------------------------------------------------------
7d64k5stnc3sk            0 select sql_id, child_number, sql_text from v$sql where sql_text like '%use_hash
cknub2x1sx8tn            0 select /*+ leading(t1) use_hash(t2) */ * from t1,t2 where t1.id=t2.t1_id and 1=2
2jhn0mg57v1tz            0 select sql_id, child_number, sql_text from v$sql where sql_text like '%use_hash
036fyatp73h9n            0 select /*+ leading(t1) use_hash(t2) */ * from t1,t2 where t1.id=t2.t1_id
SQL> select * from table(dbms_xplan.display_cursor('cknub2x1sx8tn', 0, 'allstats last'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  cknub2x1sx8tn, child number 0
-------------------------------------
select /*+ leading(t1) use_hash(t2) */ * from t1,t2 where t1.id=t2.t1_id and 1=2
Plan hash value: 487071653
--------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   | Om
--------------------------------------------------------------------------------
|*  1 |  FILTER             |      |      1 |        |      0 |00:00:00.01 |
|*  2 |   HASH JOIN         |      |      0 |    100 |      0 |00:00:00.01 |   7
|   3 |    TABLE ACCESS FULL| T1   |      0 |    100 |      0 |00:00:00.01 |
|   4 |    TABLE ACCESS FULL| T2   |      0 |   100K |      0 |00:00:00.01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(NULL IS NOT NULL)
   2 - access("T1"."ID"="T2"."T1_ID")
Note
-----
   - dynamic sampling used for this statement
Rows selected
From the two execution plans above, we can conclude that in a hash join the driving table and the driven table are each accessed only 1 or 0 times.
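The Starts = 0 result in the second plan can be mimicked with lazy Python generators: when a filter sitting above the join knows its condition is always false, it never asks its child for rows, so neither table scan is ever started. This is only an analogy to the FILTER (NULL IS NOT NULL) step, with hypothetical helper names (counted_scan, hash_join, constant_filter); it does not describe Oracle's internal row source code.

```python
def counted_scan(rows, stats, name):
    """Lazy table scan: a start is recorded only when rows are first requested."""
    stats[name] += 1
    yield from rows

def hash_join(build, probe, build_key, probe_key, stats):
    """Lazy hash join: nothing runs until the parent pulls the first row."""
    stats["HASH JOIN"] += 1
    buckets = {}
    for row in build:
        buckets.setdefault(row[build_key], []).append(row)
    for row in probe:
        for match in buckets.get(row[probe_key], []):
            yield (match, row)

def constant_filter(predicate_is_true, child):
    """Filter on a constant condition: if it is false, the child is never started."""
    if not predicate_is_true:
        return
    yield from child

def run(predicate_is_true):
    stats = {"HASH JOIN": 0, "T1": 0, "T2": 0}
    t1 = counted_scan([{"id": i} for i in (1, 2, 3)], stats, "T1")
    t2 = counted_scan([{"id": n, "t1_id": n % 5} for n in range(1, 11)], stats, "T2")
    plan = constant_filter(predicate_is_true, hash_join(t1, t2, "id", "t1_id", stats))
    return list(plan), stats

rows_false, starts_false = run(False)   # analogous to "... and 1=2"
rows_true, starts_true = run(True)      # analogous to the first test query
```

With the always-false condition every counter stays at 0, and with a true condition every counter is 1, mirroring the Starts column in the two plans.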