SQL Server Performance Tuning Execution plan (execution plan) tuning

Source: Internet
Author: User
Tags one table

There are three join strategies for SQL Server: Hash join,merge join,nested Loop join.

hash Join: Used to handle data that is not ordered/not indexed, it creates a hash table in memory of the data (the associated key) on both sides of the Join. For example, with the following query statement, the associated two tables are not indexed, and the execution plan is displayed as a hash Join.

[SQL]
    1. SELECT
    2. sh.*
    3. From
    4. Salesordheaderdemo as sh
    5. JOIN
    6. Salesorddetaildemo as SD
    7. On
    8. Sh. Salesorderid=sd. SalesOrderID
    9. GO

Merge Join: Used to process indexed data, which is lighter than hash join. We index the associated columns of the previous two tables, and then again the above query, the execution plan changes to the merge Join

[SQL]
    1. CREATE UNIQUE CLUSTERED INDEX idx_salesorderheaderdemo_salesorderid on salesordheaderdemo (SalesOrderID) /c4>
    2. GO
    3. CREATE UNIQUE CLUSTERED INDEX idx_salesdetail_salesorderlid on salesorddetaildemo (SalesOrderID, Salesorderdetailid)
    4. GO


Nested Loop Join: On the basis of meeting the merge join, if there is less data on one side, SQL Server takes the one with less data as an outer loop and the other as an internal loop to complete the join process. Continuing with the previous example, adding a where statement to the query statement to reduce the amount of data on the Join side, the execution plan appears as a nested Loop Join.

[SQL]
    1. SELECT
    2. sh.*
    3. From
    4. Salesordheaderdemo as sh
    5. JOIN
    6. Salesorddetaildemo as SD
    7. On
    8. Sh. Salesorderid=sd. SalesOrderID
    9. WHERE
    10. Sh. salesorderid=43659

Improvements in the execution plan (Table/index scan)

On many occasions we need to extract a small amount of data from a table that contains a lot of data, so scan should be avoided because scan processing will traverse each line, which is time consuming. Let's look at an example:

[SQL]
    1. SELECT
    2. Sh. SalesOrderID
    3. From
    4. Salesordheaderdemo as sh
    5. JOIN
    6. Salesorddetaildemo as SD
    7. On
    8. Sh. Salesorderid=sd. SalesOrderID
    9. WHERE
    10. Sh. orderdate=' 2005-07-01 00:00:00.000 '
    11. GO


The red circle in the diagram marks the table scan, and the execution plan is intelligently recommended for indexing. Let's first try to build an index on the Salesordheader table:

[SQL]
    1. CREATE UNIQUE CLUSTERED INDEX idx_salesorderheaderdemo_salesorderid on salesordheaderdemo (SalesOrderID) /c5>
    2. GO


Then execute the same query statement again, and the execution plan becomes the following:

Table Scan changes to index scan and continues to index another table:

[SQL]
    1. CREATE UNIQUE CLUSTERED INDEX idx_salesdetail_salesorderlid on salesorddetaildemo (SalesOrderID, Salesorderdetailid)
    2. GO


The following changes have occurred in the execution plan:

While it is not possible to say that Scan is worse than seek, seek is a better choice for most occasions (especially when finding small amounts of data in many data). For example, if you have a table of billions of data and you want to take 100 of them, you should make sure that you use Seek, but if you need to take out most of the data (say 95%), Scan might be better. (a more authoritative article gives the threshold value of 30%, which is more efficient when fetching more than 30% data, whereas Seek is better)

In addition, you may notice that both tables are indexed but one table is represented as Clustered Index scan in the execution plan, and the other is Clustered index seek, and we are not looking for two Clustered index seek? This is because the previous table does not have an assertion (predicate), and the latter table asserts the SalesOrderID with the on keyword.

Key Lookup in the execution plan

For the following example, we first set up two different indexes on the same table:

[SQL]
    1. CREATE UNIQUE CLUSTERED INDEX idx_salesdetail_salesorderlid on salesorddetaildemo (SalesOrderID, Salesorderdetailid)
    2. GO
    3. CREATE nonclustered INDEX idx_non_clust_salesorddetaildemo_modifieddate on salesorddetaildemo (ModifiedDate)
    4. GO


Execute the following query:

[SQL]
    1. SELECT
    2. ModifiedDate
    3. From Salesorddetaildemo
    4. WHERE modifieddate=' 2005-07-01 00:00:00.000 '
    5. GO

Execution plan For example, he used the non-clustered index, which we previously established on the ModifiedDate field, to become an index Seek treatment.

Let's change the query statement by adding two additional fields to the SELECT:

[SQL]
    1. SELECT
    2. ModifiedDate,
    3. SalesOrderID,
    4. Salesorderdetailid
    5. From Salesorddetaildemo
    6. WHERE modifieddate=' 2005-07-01 00:00:00.000 '
    7. GO

The execution plan, like, basically unchanged:

The above selected field is not belong to non-clustered index is Clustered index, if you add a few other fields?

[SQL]
    1. SELECT
    2. ModifiedDate,
    3. SalesOrderID,
    4. Salesorderdetailid,
    5. ProductID,
    6. UnitPrice
    7. From Salesorddetaildemo
    8. WHERE modifieddate=' 2005-07-01 00:00:00.000 '
    9. GO

Baby, execute the plan a little more. Two processing (Key Lookup, Nested Loop):


Key lookup is a heavy processing, and we can avoid key lookup by using the keyword with to specify the use of Clustered Index.

[SQL]
    1. SELECT
    2. ModifiedDate,
    3. SalesOrderID,
    4. Salesorderdetailid,
    5. ProductID,
    6. UnitPrice
    7. From Salesorddetaildemo with (index=idx_salesdetail_salesorderlid)
    8. WHERE modifieddate=' 2005-07-01 00:00:00.000 '
    9. GO

The execution plan turns into a Clustered Index Scan:

Previously mentioned Scan does not seem to be a good deal, then the dwarf in a higher, using SET STATISTICS IO on to compare:

[SQL]
  1. SET STATISTICS IO on
  2. GO
  3. SELECT
  4. ModifiedDate,
  5. SalesOrderID,
  6. Salesorderdetailid,
  7. ProductID,
  8. UnitPrice
  9. From Salesorddetaildemo
  10. WHERE modifieddate=' 2005-07-01 00:00:00.000 '
  11. GO
  12. SELECT
  13. ModifiedDate,
  14. SalesOrderID,
  15. Salesorderdetailid,
  16. ProductID,
  17. UnitPrice
  18. From Salesorddetaildemo with (index=idx_salesdetail_salesorderlid)
  19. WHERE modifieddate=' 2005-07-01 00:00:00.000 '
  20. GO
  21. SELECT
  22. ModifiedDate,
  23. SalesOrderID,
  24. Salesorderdetailid,
  25. ProductID,
  26. UnitPrice
  27. From Salesorddetaildemo with (index=idx_non_clust_salesorddetaildemo_modifieddate)
  28. WHERE modifieddate=' 2005-07-01 00:00:00.000 '
  29. GO


Compared with the clustered index query performance is the worst, and the SET STATISTICS IO output Data clustered index query on logical reads spent more time.

Looks like the non-clustered index + Key lookup execution plan is good, but if you can avoid Key lookup is perfect, let's revise non-clustered index to include it in the index with the Include keyword His fields are:

[SQL]
  1. DROP INDEX idx_non_clust_salesorddetaildemo_modifieddate on salesorddetaildemo
  2. GO
  3. CREATE nonclustered INDEX idx_non_clust_salesorddetaildemo_modifieddate on salesorddetaildemo (ModifiedDate)
  4. INCLUDE
  5. (
  6. ProductID,
  7. UnitPrice
  8. )
  9. GO
  10. --Clear the cache, only for the development environment!
  11. DBCC Freeproccache
  12. DBCC dropcleanbuffers
  13. GO

Execute the previous query again:

[SQL]
    1. SELECT
    2. ModifiedDate,
    3. SalesOrderID,
    4. Salesorderdetailid,
    5. ProductID,
    6. UnitPrice
    7. From Salesorddetaildemo
    8. WHERE modifieddate=' 2005-07-01 00:00:00.000 '
    9. GO

This is perfect because our query fields are included in the index, so the execution plan is eventually optimized to index Seek.

SQL Server Performance Tuning Execution plan (execution plan) tuning

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.