1 Overview
Hash join shares many characteristics with merge join. Like merge join, it requires at least one equality (equijoin) predicate. When no index can be used on the join columns, or when the inputs are large, nested loops join and merge join may perform poorly. In these cases, hash join is worth considering.
2 Basic Algorithm
Hash join runs in two phases: build and probe. In the build phase, one input (the build table) is read, each row is hashed on its join columns, and the rows are stored in an in-memory hash table (the build hash table). In the probe phase, each row of the second input (the probe table) is hashed on its join columns and compared with the rows in the corresponding bucket of the build hash table. Every pair of rows whose join columns are equal is returned.
Pseudocode:
for each row R1 in the build table
    begin
        calculate hash value on R1 join key(s)
        insert R1 into the appropriate hash bucket
    end
for each row R2 in the probe table
    begin
        calculate hash value on R2 join key(s)
        for each row R1 in the corresponding hash bucket
            if R1 joins with R2
                return (R1, R2)
    end
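The two phases above can be sketched in Python. This is only an illustrative in-memory version, not how SQL Server implements the operator: the function name `hash_join`, the key-extractor parameters, the fixed `num_buckets` value, and the sample rows are all assumptions made for the example.

```python
def hash_join(build_rows, probe_rows, build_key, probe_key, num_buckets=8):
    """Yield (R1, R2) pairs whose join keys are equal (inner equijoin)."""
    # Build phase: hash each build row on its join key into a bucket.
    buckets = [[] for _ in range(num_buckets)]
    for r1 in build_rows:
        buckets[hash(build_key(r1)) % num_buckets].append(r1)

    # Probe phase: hash each probe row and scan only the matching bucket.
    for r2 in probe_rows:
        k2 = probe_key(r2)
        for r1 in buckets[hash(k2) % num_buckets]:
            # Different keys can share a bucket, so compare the keys themselves.
            if build_key(r1) == k2:
                yield (r1, r2)


# Hypothetical sample rows of the form (a, label).
build = [(0, "x"), (6, "y"), (6, "z")]
probe = [(6, "p"), (3, "q"), (0, "r")]
result = list(hash_join(build, probe, lambda r: r[0], lambda r: r[0]))
for pair in result:
    print(pair)
```

The explicit key comparison inside the bucket corresponds to the "if R1 joins with R2" step: hashing only narrows the search to one bucket, and the equality test decides whether the rows actually join.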
3 Example
Test Data
CREATE TABLE T1 (a INT, b INT, x CHAR(200))
CREATE TABLE T2 (a INT, b INT, x CHAR(200))
CREATE TABLE T3 (a INT, b INT, x CHAR(200))

SET NOCOUNT ON
DECLARE @i INT

-- T1: 1,000 rows
SET @i = 0
WHILE @i < 1000
BEGIN
    INSERT T1 VALUES (@i * 2, @i * 5, @i)
    SET @i = @i + 1
END

-- T2: 10,000 rows
SET @i = 0
WHILE @i < 10000
BEGIN
    INSERT T2 VALUES (@i * 3, @i * 7, @i)
    SET @i = @i + 1
END

-- T3: 100,000 rows
SET @i = 0
WHILE @i < 100000
BEGIN
    INSERT T3 VALUES (@i * 5, @i * 11, @i)
    SET @i = @i + 1
END
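As a rough sanity check on this test data, the sketch below rebuilds the a and b columns in Python and counts how many rows should survive an equijoin of T1 and T2 on their a columns, followed by an equijoin of T1.b with T3.a. It assumes the first column of each table is named a, omits column x, and uses the hypothetical list names `t1`, `t2`, and `t3`.

```python
# Mirror the three tables as lists of (a, b) tuples; column x is omitted.
t1 = [(i * 2, i * 5) for i in range(1000)]
t2 = [(i * 3, i * 7) for i in range(10000)]
t3 = [(i * 5, i * 11) for i in range(100000)]

# T1.a = T2.a: the a values are unique within each table, so every
# T1 row whose key appears in T2 produces exactly one joined row.
t2_keys = {a for a, _ in t2}
t1_t2 = [row for row in t1 if row[0] in t2_keys]

# T1.b = T3.a: the same reasoning applies to the second join.
t3_keys = {a for a, _ in t3}
t1_t2_t3 = [row for row in t1_t2 if row[1] in t3_keys]

print(len(t1_t2), len(t1_t2_t3))  # prints "334 334"
```

Only the multiples of 6 up to 1998 appear in both T1.a and T2.a, which gives 334 rows, and every T1.b value is a multiple of 5 that also appears in T3.a, so the second join keeps all of them.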
Execute SQL:
SET STATISTICS PROFILE ON
SELECT *
FROM (T1 INNER JOIN T2 ON T1.a = T2.a)
    INNER JOIN T3 ON T1.b = T3.a
OPTION (HASH JOIN)
Execution result: