How to optimize SQL Server queries and SQL Server

Source: Internet
Author: User

How to optimize SQL Server queries and SQL Server

In today's article, I want to show you how to convey your work and thinking process to the query optimizer when you want to create an index for a specific query. Let's discuss it together!

Problematic Query
Let's look at the following queries:

 DECLARE @i INT = 999 SELECT   SalesOrderID,    SalesOrderDetailID,   CarrierTrackingNumber,    OrderQty,    LineTotal FROM Sales.SalesOrderDetail WHERE ProductID < @i ORDER BY CarrierTrackingNumber GO

As you can see, a local variable and a non-equivalent predicate are used to obtain some records from the Sales. SalesOrderDetail table. When you execute the query and view its execution plan, you will find that it has some serious problems:

  • SQL Server needs to scan the entire non-clustered index of the Sales. SalesOrderDetail table because no non-clustered index is supported. For this scan, the query requires 1382 logical reads and the running time is nearly 800 milliseconds.
  • The Filter operator is introduced in the query plan by the query optimizer. It performs row-by-row comparison to check the conforming rows (ProductID <@ I)
  • Because order by CarrierTrackingNumber, A Sort operator in the execution plan is introduced.
  • The sorting operator spreads to TempDb because the Cardinality Estimation is incorrect ). SQL Server uses a combination of local variables and non-equivalent predicates to estimate 30% rows from the base code of the table. In our case, the estimated number of rows is 36395 (121317 * 30% ). In fact, the query returns 120621 rows, which means that the Sort operator must spread to TempDb because the requested memory is too small.

Now I ask you -- can you improve this query? What is your suggestion? Take a few minutes to rest. Without modifying the query itself, how do you improve this query?

Let's debug the query!
Of course, we need to make index-related adjustments for improvement. No non-clustered index is supported. It can only be the only query optimizer that can run our query using the plan. But what is a good non-clustered index for this specified query? Generally, I look at the search predicates to consider possible non-clustered stenography. In our example, the search predicates are as follows:

WHERE ProductID <@ I

The row that we request to filter in the ProductID column. Therefore, we want to create a supported non-clustered index in that column. We create an index:

CREATE NONCLUSTERED INDEX idx_Test ON Sales.SalesOrderDetail(ProductID) GO

After a non-clustered index is created, we need to verify the changes, so we will execute the query code again. How can I squeeze the results? The query optimizer does not use the non-clustered index We just created! We have created a supported non-clustered index on the search predicates. Does the query optimizer reference it? People usually have no such thing. In fact, we can prompt the query optimizer to use non-clustered indexes to better understand why the query optimizer does not automatically select indexes:

 DECLARE @i INT = 999  SELECT  SalesOrderID,   SalesOrderDetailID,  CarrierTrackingNumber,   OrderQty,   LineTotalFROM Sales.SalesOrderDetail WITH (INDEX(idx_Test))WHERE ProductID < @i ORDER BY CarrierTrackingNumber GO

When you look at the execution plan, you will see the following wild-a parallel plan:

Query takes 370109 logical reads! The running time is basically the same as the previous one. What happened here? When you carefully look at the execution plan, you will find that the query optimizer introduces the bookmarksearch, because the created non-clustered index is not a covering non-clustered index for the query. The query goes beyond the so-called Tipping Point, because we use the current search predicates to obtain almost all rows. Therefore, it is meaningless to combine non-clustered indexes and bookmarks.

Without thinking about why the query optimizer does not select the created non-clustered index, we have expressed our ideas to the query optimizer itself and asked the query optimizer through the query prompt, why are non-clustered indexes not automatically selected. As I said at the beginning: I don't want to think too much about it.

To solve this problem using non-clustered indexes, on the leaf layer of non-clustered indexes, we must include additional columns requested from the SELECT list. You can check the bookmarks again to find out which columns on the leaf layer are currently lost:

  • CarrierTrackingNumber
  • OrderQty
  • UnitPrice
  • UnitDiscountPrice

We recreate the non-clustered index:

CREATE NONCLUSTERED INDEX idx_Test ON Sales.SalesOrderDetail(ProductID)INCLUDE (CarrierTrackingNumber, OrderQty, UnitPrice, UnitPriceDiscount) WITH( DROP_EXISTING = ON )GO


We have made another change, so we can re-run the query to verify it. However, the query prompt is not added this time, because the query optimizer will automatically select non-clustered indexes. How can I squeeze the results? When you look at the execution plan, the index is selected now.

SQL Server now performs a search operation on a non-clustered index, but we also have the Sort operator in the execution plan. Because the base number is 30% hard-coded, the sorting (Sort) still needs to spread to TempDb. My God! Our logical reads have dropped to 757, but the running time is still nearly 800 milliseconds. What should you do now?

Now we can try to directly include the CarrierTrackingNumber column in the navigation structure of a non-clustered index. This is the column of the SQL Server sorting operator. When we directly add this column (as the primary key) in a non-clustered index, we physically Sort the column, so the Sort operator should disappear. As a positive side effect, it will not spread to TempDb. In the Execution Plan, there are no operators concerned with incorrect base calculation. So let's try that assumption and re-create a non-clustered index:

 CREATE NONCLUSTERED INDEX idx_Test ON Sales.SalesOrderDetail(CarrierTrackingNumber, ProductID)INCLUDE (OrderQty, UnitPrice, UnitPriceDiscount) WITH(   DROP_EXISTING = ON )GO

From the index definition, we can see that we have physically pre-ordered the data in the CarrierTrackingNumber and ProductID columns. When you re-execute the query, you will see that the Sort operator disappears when you view the execution plan, SQL Server scans the entire leaf layer of a non-clustered index (using the residual predicates (residual predicate) as the search predicates ).

This execution plan is not bad! We only need 763 logical reads, and now the running time has been reduced to 600 milliseconds. It has improved by 25% compared with the previous one! However, the query optimizer recommends that you use a better non-clustered Index. We recommend that you use the Missing Index Recommendations )! For the moment, I believe that we have created the suggested non-clustered index:

CREATE NONCLUSTERED INDEX [SQL Server doesn't care about names, why I should care about names?]ON [Sales].[SalesOrderDetail] ([ProductID])INCLUDE ([SalesOrderID],[SalesOrderDetailID],[CarrierTrackingNumber],[OrderQty],[LineTotal])GO

When you re-execute the initial query, you will find that the surprising thing is that the query optimizer uses the non-clustered index We just created. The suggestion for missing Indexes has gone away!

You have just created an index that SQL Server never uses-except the INSERT, UPDATE, and DELETE statements, SQL Server must maintain your non-clustered index. For your database, you just created a "simple" waste of space index. On the other hand, you have met the query optimizer by eliminating the recommendation of missing indexes. But this is not the purpose: to create an index that will be used again.

Conclusion: The query optimizer is never trusted!

Summary

Today's article is a bit controversial, but I want to show you how the query optimizer helps you when you create an index, and how the query optimizer fooled you. Therefore, make a small adjustment and run your query immediately. It is very important to verify the change.

The above is all the content of this article, hoping to help you learn.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.