SQL Server 2014 Hash Index Principles
Translated from: http://www.sqlservercentral.com/blogs/sql-and-sql-only/2015/09/08/hekaton-part-6-hash-indexes-intro/
Hash indexes work on the same principle as hash join and hash aggregation, so understanding hash indexes also helps in understanding hash join and hash aggregation.
SQL Server 2014 introduced a new index type called the hash index. Before describing the hash index, it helps to introduce the hash function, so that the principle of the hash index is easier to follow.
When a key-value pair is passed to a hash function, the function computes a result from the key, and the pair is placed in the corresponding hash bucket according to that result.
For example:
Assume that modulo 10 (% 10) is the hash function. If the key of a key-value pair is 1525 and it is passed to the hash function, the pair is stored in the fifth bucket,
because 1525 % 10 = 5.
Similarly, 537 is stored in the seventh bucket, 2982 is stored in the second bucket, and so on.
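The modulo example above can be sketched in a few lines of Python. This is only an illustration of the idea: bucket_for is a hypothetical helper name, and % 10 is the toy hash function from the example, not the function SQL Server actually uses.

```python
# Toy hash function from the example above: key modulo 10.
# bucket_for is a hypothetical name used only for illustration.
BUCKET_COUNT = 10


def bucket_for(key: int) -> int:
    """Return the bucket number for an integer key."""
    return key % BUCKET_COUNT


for key in (1525, 537, 2982):
    print(key, "-> bucket", bucket_for(key))
```

Keys that share the same last digit, such as 1525 and 35, land in the same bucket, which is why a bucket has to be able to hold more than one entry.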
In the same way, in a hash index the value of the indexed column is passed to the hash function to find the matching bucket (similar to a lookup in a Java HashMap).
The matching hash bucket holds a pointer to the actual data row, and the row is then found by following that pointer.
To summarize, to find a row of data or to evaluate a WHERE clause, the SQL Server engine does the following:
1. Applies the hash function to the parameter in the WHERE condition.
2. Uses the resulting hash value to locate the matching hash bucket; finding the bucket also means finding the corresponding data row pointer (row pointer).
3. Reads the data.
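The three steps above can be sketched with a toy in-memory model. Everything here is an assumption made for illustration: the names build_index and lookup, the use of Python's built-in hash, and the representation of a row pointer as a list position are not how SQL Server implements its hash index.

```python
# Toy model of a hash index: each bucket holds (key, row_pointer) pairs.
# All names are hypothetical; this is not SQL Server's implementation.
BUCKET_COUNT = 8

rows = [
    {"ID": 1525, "Data": "a"},
    {"ID": 537, "Data": "b"},
    {"ID": 2982, "Data": "c"},
]


def build_index(rows, bucket_count=BUCKET_COUNT):
    """Place a (key, row pointer) entry for every row into its bucket."""
    buckets = [[] for _ in range(bucket_count)]
    for pointer, row in enumerate(rows):
        buckets[hash(row["ID"]) % bucket_count].append((row["ID"], pointer))
    return buckets


def lookup(buckets, rows, key):
    """Find the row whose ID equals key, or return None."""
    # Step 1: apply the hash function to the search value.
    bucket = buckets[hash(key) % len(buckets)]
    # Step 2: scan the matching bucket for the key and its row pointer.
    for entry_key, pointer in bucket:
        if entry_key == key:
            # Step 3: follow the row pointer and read the data.
            return rows[pointer]
    return None


index = build_index(rows)
print(lookup(index, rows, 537))
```

Note that the lookup never walks a tree: it jumps straight to one bucket and only compares the entries stored there.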
A hash index lookup is simpler than a B-tree index lookup because it does not need to traverse the B-tree, so access is faster.
Example of the hash index syntax
CREATE TABLE dbo.HK_TBL
(
    [ID] INT IDENTITY(1,1) NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000),
    -- The column length was lost in the source ("Char( +)"); 32 is an assumed value.
    [Data] CHAR(32) COLLATE Latin1_General_100_BIN2 NULL,
    [DT] DATETIME NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
In SQL Server 2014, a hash index cannot be added to a memory-optimized table after the table is created. SQL Server 2016 supports adding a hash index after the table is created,
but adding a hash index is an offline operation.
Number of buckets in a hash index
(BUCKET_COUNT = 100000) defines the number of buckets that the hash index can use. The bucket count is fixed and is specified by the user;
SQL Server does not decide the number of buckets when it executes a query. The number of buckets is always rounded up to a power of 2 (1024, 2048, 4096, etc.).
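The rounding to a power of 2 can be checked with a short calculation. This is a sketch of the arithmetic only; rounded_bucket_count is a hypothetical helper, not a SQL Server function.

```python
def rounded_bucket_count(requested: int) -> int:
    """Round a requested bucket count up to the next power of two."""
    if requested < 1:
        raise ValueError("bucket count must be at least 1")
    # (requested - 1).bit_length() is the exponent of the smallest
    # power of two that is greater than or equal to requested.
    return 1 << (requested - 1).bit_length()


print(rounded_bucket_count(100000))
```

Under this rounding, the BUCKET_COUNT = 100000 from the example table corresponds to 131072 buckets, while a request that is already a power of 2, such as 1024, stays unchanged.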
The hash index in SQL Server 2014 is similar in principle to MySQL's Adaptive Hash Index: both aim to avoid the cost of traversing a B-tree and make lookups more efficient.
The article "How does a relational database work" also describes the principle of hash join and is worth reading.
Related articles
Java hashmap that little thing
How does a relational database work
If anything here is wrong, corrections are welcome O(∩_∩)O