Sort warnings for in-memory OLTP (Hekaton)

In-memory OLTP is supposed to keep everything in memory. But that's only half the truth: in today's article I want to show you that even when you read data from memory, in-memory OLTP can still cause disk activity. The problem is the combination of incorrect statistics and a Sort operator in the execution plan.

Sort operator issues

As we all know, the Sort operator needs a so-called memory grant to run. This memory area is used to sort the incoming rows of the execution plan. The size of the memory grant is based on the estimated number of rows: during cardinality estimation, the query optimizer estimates the number of rows for every operator in the execution plan.
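If you want to watch such a memory grant happen, one option is the DMV sys.dm_exec_query_memory_grants. The following query is only a minimal sketch; run it from a second session while the sorting query executes:

-- Show the memory grants of currently executing queries
-- (run from a second session while the sorting query is running)
SELECT
    session_id,
    requested_memory_kb,
    granted_memory_kb,
    ideal_memory_kb,
    used_memory_kb
FROM sys.dm_exec_query_memory_grants
GO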

Back in June this year I wrote an article showing how the Sort operator can spill over to tempdb when the estimates are wrong. The same thing happens in in-memory OLTP: when the estimated number of rows is wrong and there is a Sort operator in the execution plan, the Sort operator spills over to tempdb! Let's reproduce that situation.

In-memory OLTP spills to tempdb

We create a new database with an in-memory OLTP filegroup configuration.

-- Create a new database
CREATE DATABASE HashCollisions
GO

-- Add a MEMORY_OPTIMIZED_DATA filegroup to the database
ALTER DATABASE HashCollisions
ADD FILEGROUP InMemoryOLTPFileGroup CONTAINS MEMORY_OPTIMIZED_DATA
GO

USE HashCollisions
GO

-- Add a new file to the previously created filegroup
ALTER DATABASE HashCollisions ADD FILE
(
    NAME = N'InMemoryOLTPContainer',
    FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL12.SQL2014\MSSQL\DATA\InMemoryOLTPContainer'
)
TO FILEGROUP [InMemoryOLTPFileGroup]
GO

Next I create a new memory-optimized table:

-- Create a test table
CREATE TABLE Table1
(
    Column1 INT IDENTITY,
    Column2 INT,
    CONSTRAINT pk_Column1 PRIMARY KEY NONCLUSTERED HASH (Column1) WITH (BUCKET_COUNT = 1)
)
WITH
(
    MEMORY_OPTIMIZED = ON,
    DURABILITY = SCHEMA_ONLY
)
GO

As you can see from the table definition, I created a hash index on the column Column1. Because it is a hash index, you also have to specify how many hash buckets the underlying hash table should have. Here I specified just one hash bucket, which is a very, very bad practice: when you insert records into the table, you get a huge number of hash collisions, because everything hashes into that one bucket. As a general rule, the number of hash buckets should match the number of unique values in the column on which you define the hash index. The following code inserts 14,001 records into the table we just created.

-- Insert 14001 records
INSERT INTO Table1 (Column2) VALUES (1)

SELECT TOP 14000 IDENTITY(INT, 1, 1) AS n INTO #Nums
FROM master.dbo.syscolumns sc1

INSERT INTO Table1 (Column2)
SELECT 2 FROM #Nums

DROP TABLE #Nums
GO

You can view the number of hash collisions through the DMV sys.dm_db_xtp_hash_index_stats. From the output of this DMV you can see that the unique hash index consists of just one hash bucket that holds all 14,001 records.
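If you want to check the collisions yourself, a query along the following lines works; this is just a sketch against that DMV, where avg_chain_length and max_chain_length expose the collision chains:

-- Inspect bucket counts and collision chain lengths of hash indexes
SELECT
    OBJECT_NAME(object_id) AS table_name,
    total_bucket_count,
    empty_bucket_count,
    avg_chain_length,
    max_chain_length
FROM sys.dm_db_xtp_hash_index_stats
GO

With BUCKET_COUNT = 1 you should see a single bucket whose chain holds all 14,001 rows. Now let's run a SELECT statement that puts a Sort operator into the execution plan.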

-- The Sort operator in this execution plan spills over to tempdb!
SELECT * FROM Table1
ORDER BY Column1

When you now look at the execution plan, you can see that the Sort operator has spilled over to tempdb.

This happens because of incorrect statistics on the hash index. When you look at the properties of the Index Scan (NonClusteredHash) operator in the execution plan, you can see that the query optimizer estimated 1 row coming from our hash index, while we actually returned 14,001 rows.
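One way to compare the estimate against reality is to capture the actual execution plan; as a sketch, SET STATISTICS XML does the job:

-- Capture the actual execution plan, so the estimated and actual
-- row counts of the Index Scan operator can be compared
SET STATISTICS XML ON
GO

SELECT * FROM Table1
ORDER BY Column1
GO

SET STATISTICS XML OFF
GO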

The estimate for the Index Scan (NonClusteredHash) operator is always based on the hash bucket count of the hash table: the query optimizer assumes that you have no hash collisions, which is simply not true here. The memory grant for the Sort operator is therefore based on that wrong estimate, and the Sort operator spills over to tempdb. On my system this query ran for almost 80 milliseconds, which is an eternity for in-memory technology.

How do you fix this problem? Drop your table and plan the number of hash buckets for your hash index carefully, as sketched below. Welcome to the exciting world of in-memory OLTP ...
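As a sketch of that fix, the table can be re-created with a bucket count that covers the 14,001 unique values in Column1 (memory-optimized tables in SQL Server 2014 can't be altered, so it's a drop and re-create; SQL Server rounds the bucket count up to the next power of two, hence 16384):

-- Drop the table and re-create it with a sensible bucket count
DROP TABLE Table1
GO

CREATE TABLE Table1
(
    Column1 INT IDENTITY,
    Column2 INT,
    CONSTRAINT pk_Column1 PRIMARY KEY NONCLUSTERED HASH (Column1) WITH (BUCKET_COUNT = 16384)
)
WITH
(
    MEMORY_OPTIMIZED = ON,
    DURABILITY = SCHEMA_ONLY
)
GO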

Summary

When you use hash indexes with in-memory OLTP, you need to plan the number of hash buckets carefully. When you get it wrong, it hurts performance. I wrote an article one month ago describing how hash collisions hurt the performance of in-memory OLTP, even when nothing spills over to tempdb!

And as we have seen here: as soon as you have hash collisions in a hash index, you can't expect amazingly fast performance from in-memory OLTP, because the collisions add a huge overhead and skew the cardinality estimation.

Thanks for your attention!
