Hash join of three physical connections of SQL Server (iii)

Source: Internet
Author: User

Brief introduction

In SQL Server 2012, in some special cases, you will see the following icon:

The Hash join is divided into two stages, namely the generation and detection phases.

The first is the build phase, and each entry in the input source is calculated by hashing the hash function into a different hash bucket, where the hash function selection and the number of hash buckets are black boxes, generally speaking, The query optimizer uses whichever input set that is smaller on both sides of the connection as the input source for the first stage.

Next is the probing phase, for another input set, the same hash function for each row, determine the hash bucket that it should be in, match each row in this row and corresponding hash bucket, and return the corresponding row if matched.

By understanding the principle of hash matching is not difficult to see, hash matching involves the hash function, so the CPU consumption will be very high, in addition, in the hash bucket row is unordered, so the output is unordered.

Sample Demo:

IF object_id (' dbo. Tbl10 ') is not a NULL DROP TABLE dbo. TBL10; CREATE TABLE dbo. TBL10 (Id int IDENTITY (), Val int, Fill CHAR (7000) not NULL DEFAULT REPLICATE (' Fill ', 1750)); IF object_id (' dbo. Tbl100 ') is not a NULL DROP TABLE dbo. Tbl100; CREATE TABLE dbo. Tbl100 (Id int IDENTITY (), Val int, Fill CHAR (7000) not NULL DEFAULT REPLICATE (' Fill ', 1750)); INSERT into dbo. Tbl10 (Val) SELECT TOP (Ten) 1+row_number () Over (ORDER by (SELECT 1))%100from sys.all_columns A, Sys.all_columns B, Sys.all_  Columns C; INSERT into dbo. Tbl100 (Val) SELECT TOP (row_number) over (ORDER by (SELECT 1)) from Sys.all_columns A, Sys.all_columns B, Sys.all_ Columns C; SELECT Index_type_desc, Alloc_unit_type_desc, Index_depth, Page_count, Record_countfrom Sys.dm_db_index_physical_ Stats (db_id (), object_id (' dbo. Tbl10 '), Null,null, ' SAMPLED '); SELECT Index_type_desc, Alloc_unit_type_desc, Index_depth, Page_count, Record_countfrom Sys.dm_db_index_physical_ Stats (db_id (), object_id (' dbo. Tbl100 '), Null,null, ' SAMPLED ');

Use the hash join to execute the following query

SET STATISTICS IO on; Goselect * FROM dbo. Tbl100 ainner HASH JOIN dbo. Tbl10 BON a.val = b.val; Goset STATISTICS IO OFF;

To view statistical information:

Table ' worktable '. Scan count 0, logical reads 0, physical reads 0, Read-ahead reads 0, LOB logical reads 0, LOB physical reads 0, LOB Read-a Head reads 0.
Table ' Tbl10 '. Scan count 1, logical reads, physical reads 0, Read-ahead reads 0, LOB logical reads 0, LOB physical reads 0, LOB read- Ahead reads 0.
Table ' Tbl100 '. Scan count 1, logical reads, physical reads 0, Read-ahead reads 0, LOB logical reads 0, LOB physical reads 0, LOB read -ahead reads 0.

Advantages and Disadvantages

• All logical connection types can be processed.

• It does require a fairly large amount of memory, large enough to fit the entire left input. (may be in tempdb)

• During the entire process of creating a hash index, no row will be returned.

There is at least one column in the join condition that needs to use an equal comparison.

Because of the expensive build-up phase, this is the most resource-intensive connection algorithm. However, once the hash table is established, this connection algorithm can be very fast. For large tables that do not have an index available, the time savings during the probing phase will exceed the additional cost of the construction phase. However, keep in mind that because of the large memory requirements, only a very small number of these can run at the same time.

Hash join of three physical connections of SQL Server (iii)

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.