In Hekaton, select the number of hash buckets correctly

Source: Internet
Author: User

Today I'm going to insert 1 million records into a Hekaton table with a hash index of 1024 buckets and test how hash collisions affect Hekaton's throughput depending on the bucket count. The results are very, very interesting. First, let me explain what a hash collision is.

As you (hopefully) know, Hekaton tables in SQL Server 2014 are built on hash indexes. Wikipedia has a detailed introduction to hash tables, the data structure behind hash indexes. A hash function maps each index key to a bucket in the hash index, so the result of the hash function determines which hash bucket your row ends up in. If multiple key values hash to the same value, SQL Server stores all of them in that one bucket as multiple entries chained together. Take a look at the following illustration (from Wikipedia):

As you can see, the key values "John Smith" and "Sandra Dee" hash to the same bucket, here bucket number 152. Both rows therefore live in the same hash bucket, which hurts both INSERT and SELECT performance: during an INSERT, SQL Server has to maintain the linked list, and during a SELECT it has to scan that linked list.
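To get a feel for how long those linked lists can become, here is a minimal back-of-the-envelope sketch. It assumes the 1,000,000 unique keys and the two bucket counts (1024 and 1048576) used in the test in this article, and it assumes the keys spread perfectly evenly over the buckets, which is the best case; real chains can only be longer.

```sql
-- Sketch: best-case average chain length if the keys spread evenly over the buckets.
-- Assumption: 1,000,000 unique keys, as inserted by the test in this article.
DECLARE @rows FLOAT = 1000000;

SELECT
    @rows / 1024    AS avg_chain_1024_buckets,
    @rows / 1048576 AS avg_chain_1048576_buckets;
```

With 1024 buckets every chain holds roughly 977 rows on average, while with 1048576 buckets the average drops below one row per bucket.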

Now that we know what a hash collision is, let's use a simple example to demonstrate its impact on performance. Let's create a database with a Hekaton table:

    -- Create new database
    CREATE DATABASE HashCollisions
    GO

    -- Add MEMORY_OPTIMIZED_DATA filegroup to the database
    ALTER DATABASE HashCollisions
    ADD FILEGROUP HekatonFileGroup CONTAINS MEMORY_OPTIMIZED_DATA
    GO

    USE HashCollisions
    GO

    -- Add a new file to the previously created file group
    ALTER DATABASE HashCollisions ADD FILE
    (
        NAME = N'HekatonContainer',
        FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\HashCollisionsContainer'
    )
    TO FILEGROUP [HekatonFileGroup]
    GO

    -- Create a simple table
    CREATE TABLE TestTable
    (
        Col1 INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1024),
        Col2 INT NOT NULL,
        Col3 INT NOT NULL
    )
    WITH
    (
        MEMORY_OPTIMIZED = ON,
        DURABILITY = SCHEMA_ONLY
    )
    GO

As you can see from the code, I'm using only 1024 hash buckets, which is not many for the 1,000,000 records I'm about to insert into the table. Next I create a natively compiled stored procedure so that I get the full speed of Hekaton:

    -- Create a natively compiled stored procedure
    CREATE PROCEDURE InsertTestData
    WITH
        NATIVE_COMPILATION,
        SCHEMABINDING,
        EXECUTE AS OWNER
    AS
    BEGIN ATOMIC WITH
    (
        TRANSACTION ISOLATION LEVEL = SNAPSHOT,
        LANGUAGE = N'us_english'
    )
        DECLARE @i INT = 0

        WHILE @i < 1000000
        BEGIN
            INSERT INTO dbo.TestTable (Col1, Col2, Col3) VALUES (@i, @i, @i)

            SET @i += 1
        END
    END
    GO

As you can see, I use a simple loop to insert 1 million records. On a virtual machine with 4 CPU cores and 4 GB of memory, we turn on time statistics and execute the stored procedure:

    SET STATISTICS TIME ON

    EXEC dbo.InsertTestData

The execution takes almost 42 seconds, which is very slow. Now let's raise the bucket count to 1048576; you will see that performance improves considerably with the larger number of buckets.

    DROP PROCEDURE dbo.InsertTestData
    DROP TABLE dbo.TestTable

    -- Create a simple table
    CREATE TABLE TestTable
    (
        Col1 INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1048576),
        Col2 INT NOT NULL,
        Col3 INT NOT NULL
    )
    WITH
    (
        MEMORY_OPTIMIZED = ON,
        DURABILITY = SCHEMA_ONLY
    )
    GO

    -- Create a natively compiled stored procedure
    CREATE PROCEDURE InsertTestData
    WITH
        NATIVE_COMPILATION,
        SCHEMABINDING,
        EXECUTE AS OWNER
    AS
    BEGIN ATOMIC WITH
    (
        TRANSACTION ISOLATION LEVEL = SNAPSHOT,
        LANGUAGE = N'us_english'
    )
        DECLARE @i INT = 0

        WHILE @i < 1000000
        BEGIN
            INSERT INTO dbo.TestTable (Col1, Col2, Col3) VALUES (@i, @i, @i)

            SET @i += 1
        END
    END
    GO

We execute the stored procedure again:

    SET STATISTICS TIME ON

    EXEC dbo.InsertTestData

Executing the same stored procedure now takes only 780 milliseconds, a big difference compared with the first test with 1024 buckets. You can also use the DMV sys.dm_db_xtp_hash_index_stats to see how many buckets are used in your hash index:

    SELECT * FROM sys.dm_db_xtp_hash_index_stats
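If the raw DMV output is hard to read, a query along the following lines may help. It is only a sketch: the join to sys.indexes and the choice of columns are my additions for readability, not part of the original test.

```sql
-- Sketch: bucket usage and chain lengths for every hash index in the current database.
SELECT
    OBJECT_NAME(s.object_id) AS table_name,
    i.name                   AS index_name,
    s.total_bucket_count,
    s.empty_bucket_count,
    s.avg_chain_length,
    s.max_chain_length
FROM sys.dm_db_xtp_hash_index_stats AS s
INNER JOIN sys.indexes AS i
    ON  i.object_id = s.object_id
    AND i.index_id  = s.index_id
ORDER BY s.avg_chain_length DESC
```

A high avg_chain_length together with few empty buckets suggests the bucket count is too low for the number of distinct keys, while a very large share of empty buckets suggests it is higher than it needs to be.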

What does this test tell us? Choose the bucket count of your Hekaton hash indexes carefully, because it can significantly affect SQL Server performance! The ideal bucket count is the number of distinct values in the hash index key, plus some free space (a little above the number of distinct values) as a safety margin. You also shouldn't set the bucket count too high, because then you just waste memory. As with almost every setting in SQL Server, the right value depends on your workload; the one exception is database shrinking.
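As a rough illustration of this sizing rule, the following sketch derives a candidate bucket count from the data already in dbo.TestTable. The headroom factor of 1.5 is an assumption of mine (pick whatever margin suits your expected growth), and the variable names are purely illustrative. The result is rounded up to a power of two because SQL Server rounds BUCKET_COUNT up to the next power of two anyway, and the memory estimate assumes 8 bytes per bucket.

```sql
-- Sketch: suggest a BUCKET_COUNT for the hash index on dbo.TestTable (Col1).
DECLARE @distinct_keys BIGINT = (SELECT COUNT_BIG(DISTINCT Col1) FROM dbo.TestTable);
DECLARE @headroom FLOAT = 1.5;  -- assumed safety margin, not a value from the article
DECLARE @target FLOAT = @distinct_keys * @headroom;

-- Round up to the next power of two, as SQL Server does internally.
DECLARE @bucket_count BIGINT = 1;
WHILE @bucket_count < @target
    SET @bucket_count = @bucket_count * 2;

SELECT
    @distinct_keys           AS distinct_keys,
    @bucket_count            AS suggested_bucket_count,
    @bucket_count * 8 / 1024 AS bucket_memory_kb;  -- 8 bytes per bucket
```

For the 1,000,000 distinct keys in this test, the target is 1,500,000, so the loop stops at 2,097,152 buckets, which costs about 16 MB of memory for the bucket array alone.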

Thanks for your attention!
