Chapter 6: Optimizing performance with execution plans (1): understanding the hash, merge, and nested loops join strategies

Preface:

This series of articles includes:

1. Understand the hash, merge, and nested loops join strategies.

2. Find and resolve table/index scans in the execution plan.

3. Introduce key lookups, and find and resolve them in the execution plan.

Performance optimization focuses on the following tasks:

1. Create a performance baseline for your environment.

2. Monitor current performance and find the bottlenecks.

3. Resolve the bottlenecks to get better performance.

An estimated execution plan is a blueprint that describes how a query is expected to be executed, and an actual execution plan is a record of what really happened when the query ran. By comparing the two, you can find out whether the query actually executed the way the estimated plan predicted.

Several important concepts in an execution plan need to be understood clearly:

1. Join strategies: SQL Server has three, namely hash, merge, and nested loops. Each has its advantages and disadvantages, and they are the subject of this article.

2. Scans and seeks are the two ways SQL Server reads data, and they are core concepts in performance optimization. They are described in the next article.

3. Key lookups can sometimes become a major performance problem. When a query needs a column that is not stored in the nonclustered index it uses, the storage engine has to jump from the nonclustered index to the clustered index to fetch that value. This is usually time-consuming.
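The jump described in item 3 can be pictured with a toy model. The following Python sketch is only an illustration, not how SQL Server stores data: the table, the column names, and the values are all invented, and plain dictionaries stand in for the two indexes.

```python
# Toy model of a key lookup. All names and values here are hypothetical.

# "Clustered index": clustering key -> the full row.
clustered = {
    1: {"order_id": 1, "customer": "A", "total": 10.0},
    2: {"order_id": 2, "customer": "B", "total": 25.5},
    3: {"order_id": 3, "customer": "A", "total": 7.25},
}

# "Nonclustered index" on customer: it holds only the indexed column
# and the clustering key, not the other columns of the row.
nonclustered = {}
for key, row in clustered.items():
    nonclustered.setdefault(row["customer"], []).append(key)


def totals_for(customer):
    """Return the totals for one customer and the number of key lookups done."""
    totals = []
    lookups = 0
    for key in nonclustered.get(customer, []):  # find matches in the nonclustered index
        row = clustered[key]                    # key lookup: jump to the clustered index
        lookups += 1
        totals.append(row["total"])             # "total" is not in the nonclustered index
    return totals, lookups


print(totals_for("A"))
```

Every matching row costs one extra jump, which is why a query that returns many rows through a key lookup can become slow.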

Understanding the hash, merge, and nested loops join strategies

SQL Server provides three join strategies, and none of them is absolutely better or worse than the others.

1. Hash join: SQL Server chooses a hash join as the physical operator when the query joins inputs that are large, unsorted, or not indexed. A hash join has two phases, "build" and "probe". In the build phase, all rows are read from the build input (the left table of the join, although the optimizer may swap the inputs, so it is not necessarily the table written on the left) and a hash table keyed on the join columns is created in memory. In the probe phase, all rows are read from the probe input (the right table) and matched against that in-memory hash table according to the join condition.

2. Merge join: If the joined tables are already sorted on the join columns, SQL Server can choose a merge join. A merge join requires both inputs to be sorted on the join columns. When the amount of data is small it is more efficient than a hash join, and it is not a heavy-load join method.

3. Nested loops join: Nested loops are more efficient when one of the two result sets is much smaller than the other and the inner input has a usable index. This strategy is not suited to large result sets.

Preparation:

The following statements create two tables, which are then used to look at the execution plans for the different join strategies:

USE AdventureWorks
GO

-- Drop the demo tables if they already exist
IF OBJECT_ID('Salesordheaderdemo') IS NOT NULL
    BEGIN
        DROP TABLE Salesordheaderdemo
    END
GO

IF OBJECT_ID('Salesorddetaildemo') IS NOT NULL
    BEGIN
        DROP TABLE Salesorddetaildemo
    END
GO

-- Copy the data into new tables that have no indexes
SELECT  *
INTO    Salesordheaderdemo
FROM    Sales.SalesOrderHeader
GO

SELECT  *
INTO    Salesorddetaildemo
FROM    Sales.SalesOrderDetail
GO



Steps:

1. Execute the following query with the actual execution plan enabled (Ctrl+M):

SELECT  sh.*
FROM    Salesorddetaildemo AS sd
        INNER JOIN Salesordheaderdemo AS sh ON sh.SalesOrderID = sd.SalesOrderID
GO



2. The resulting execution plan shows that a hash join is used.


3. Now create a unique clustered index on each of the two tables:

CREATE UNIQUE CLUSTERED INDEX idx_salesorderheaderdemo_salesorderid ON Salesordheaderdemo (SalesOrderID)
GO

CREATE UNIQUE CLUSTERED INDEX idx_salesdetail_salesorderid ON Salesorddetaildemo (SalesOrderID, SalesOrderDetailID)
GO




4. Execute the statement from step 1 again.

5. The execution plan for this second run shows that the join has become a merge join and that the table scans have become clustered index scans.


6. Now look at the nested loops join. Add a WHERE clause to the query above to restrict the result set:

SELECT  sh.*
FROM    Salesorddetaildemo AS sd
        INNER JOIN Salesordheaderdemo AS sh ON sh.SalesOrderID = sd.SalesOrderID
WHERE   sh.SalesOrderID = 43659
GO



7. The execution plan now shows that the join has become a nested loops join.

Analysis:

As mentioned earlier, hash joins are used for joins over large amounts of data where the join columns are not sorted. In step 1 the tables had no index and the data was not sorted in advance, so the query used a hash join.
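To make the build and probe phases concrete, here is a minimal Python sketch of the idea. It is not SQL Server's implementation: the function name `hash_join` and the sample rows are invented for illustration, and a plain dictionary stands in for the in-memory hash table.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, key):
    """Equi-join two unsorted lists of dict rows on the column `key`."""
    # Build phase: read every row of the build input into a hash table.
    table = defaultdict(list)
    for row in build_rows:
        table[row[key]].append(row)

    # Probe phase: read every row of the probe input and look it up.
    joined = []
    for row in probe_rows:
        for match in table.get(row[key], []):
            joined.append((match, row))
    return joined


# Hypothetical sample data: neither list is sorted or indexed.
headers = [
    {"order_id": 43660, "customer": "B"},
    {"order_id": 43659, "customer": "A"},
]
details = [
    {"order_id": 43660, "line": 1},
    {"order_id": 43659, "line": 1},
    {"order_id": 43659, "line": 2},
    {"order_id": 99999, "line": 1},  # no matching header, so it is dropped
]

pairs = hash_join(headers, details, "order_id")
for header, detail in pairs:
    print(header["order_id"], header["customer"], detail["line"])
```

Each input is read exactly once and no ordering is needed, which is why this strategy suits large, unsorted inputs. The price is the memory for the hash table.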

In step 3, a unique clustered index was created on each table, so both tables are now sorted by their clustered index keys, and the optimizer chooses a merge join.
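The reason sorted inputs help can be seen in a small Python sketch of the merge idea. It is an illustration under stated assumptions, not SQL Server's implementation: the function name `merge_join` and the sample rows are invented, and both lists are assumed to be already sorted on the join column.

```python
def merge_join(left, right, key):
    """Equi-join two lists of dict rows that are BOTH already sorted by `key`."""
    joined = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][key], right[j][key]
        if left_key < right_key:
            i += 1          # left side is behind, advance it
        elif left_key > right_key:
            j += 1          # right side is behind, advance it
        else:
            # Find the run of rows with this key on each side, then pair them.
            i_end = i
            while i_end < len(left) and left[i_end][key] == left_key:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][key] == left_key:
                j_end += 1
            for left_row in left[i:i_end]:
                for right_row in right[j:j_end]:
                    joined.append((left_row, right_row))
            i, j = i_end, j_end
    return joined


# Hypothetical sample data, sorted by order_id as a clustered index would keep it.
headers = [{"order_id": 43659}, {"order_id": 43660}, {"order_id": 43661}]
details = [
    {"order_id": 43659, "line": 1},
    {"order_id": 43659, "line": 2},
    {"order_id": 43661, "line": 1},
    {"order_id": 43662, "line": 1},
]

pairs = merge_join(headers, details, "order_id")
print([(h["order_id"], d["line"]) for h, d in pairs])
```

Both lists are walked once from start to end and no hash table is built, which is what makes the strategy cheap once the inputs are already in order.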

In step 6, a nested loops join was used because the WHERE clause limits the result set to a small number of rows and because the tables are sorted by their clustered indexes.
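A rough Python sketch of this case follows. It is only a model, not SQL Server's implementation: the names `seek` and `nested_loops_join` and the sample rows are invented, and a binary search over a sorted list stands in for a seek on the clustered index of the inner table.

```python
from bisect import bisect_left, bisect_right

# Hypothetical inner input, kept sorted by order_id like a clustered index.
details = [
    {"order_id": 43659, "line": 1},
    {"order_id": 43659, "line": 2},
    {"order_id": 43660, "line": 1},
]
detail_keys = [d["order_id"] for d in details]


def seek(order_id):
    """Stand-in for an index seek: binary search instead of reading every row."""
    lo = bisect_left(detail_keys, order_id)
    hi = bisect_right(detail_keys, order_id)
    return details[lo:hi]


def nested_loops_join(outer_rows):
    joined = []
    for outer in outer_rows:                     # outer loop: once per outer row
        for inner in seek(outer["order_id"]):    # inner side: one seek per outer row
            joined.append((outer, inner))
    return joined


headers = [{"order_id": 43659}, {"order_id": 43660}]

# The filter plays the role of WHERE sh.SalesOrderID = 43659: one outer row remains.
outer = [h for h in headers if h["order_id"] == 43659]
pairs = nested_loops_join(outer)
print([(h["order_id"], d["line"]) for h, d in pairs])
```

The work grows with the number of outer rows, since each one triggers its own seek. With one outer row this is very cheap, and with a large outer input it is not.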

Each join strategy has its advantages and disadvantages, and which one is best depends on the situation. Hash joins sometimes play a very important role, but where possible it is strongly recommended that every table have a unique clustered index so that merge joins can be used. If that is not possible, never try to force the join into a merge or nested loops join with an OPTION hint, because doing so may degrade performance. Nested loops joins are best reserved for small result sets.
