MSSQLSERVER Implementation plan detailed

Last Update:2015-11-30 Source: Internet

Author: User

Tags mssqlserver

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Preface

This article has two main purposes:

1, read the T-SQL implementation plan, understand the implementation of some common sense in the plan.

2, be able to analyze the execution plan, find the idea or solution of optimizing SQL performance.

If your understanding of SQL query optimization or common sense is not very deep, then recommend a few cheat blog to you: SQLserver performance detection and Optimization tool use verbose ,SQL statement optimization analysis ,T-SQL statement Query Execution order .

Introduction to the execution plan

1. What is the execution plan?

eldest brother submitted SQL statements, database query optimizer, after analysis to generate multiple databases can be recognized by the efficient execution of queries. The optimizer will then find a number of execution plans with the least amount of resources, rather than the fastest execution, to show you, either in XML format, in text format, or as a graphical execution plan.

2. Estimated execution plan, actual execution plan

Select the statement, click on one of the execution plans above, estimate the execution plan can be displayed immediately, and the actual execution plan will need to execute the SQL statement after the appearance. The estimated execution plan is not equal to the actual execution plan, but in most cases the actual execution plan is consistent with the projected execution plan. Changes in statistical information or execution plan recompilation can be different.

3, why to read the implementation plan

The first execution plan lets you know how your complex SQL is executed, whether it is executed according to the plan you want, whether it is executed in the most efficient way, which one of the many indexes, how to sort, how to merge the data, and there is no unnecessary waste of resources and so on. Official data shows that there are problems with executing T-SQL, and 80% can find answers in the execution plan.

4. Analysis of graphical execution plan

The execution plan can be displayed in text, XML, and graphically. This cheat is mainly based on the graphical execution plan leading analysis, however, the execution plan contains 78 available operators, this article can only be used for analysis, commonly used almost contains your daily all. There are pictures on MSDN:https://msdn.microsoft.com/zh-cn/library/ms175913 (v=sql.90). aspx

5, how to see the implementation plan

The graphical execution plan is viewed from top to bottom and from left to right.

6. Clear the cached execution plan

DBCC FREEPROCACHE

DBCC FLUSHPROCINDB (DB_ID)

read the graphical execution plan

1. Connection

1, the thicker the scan affects the number of rows.

2. Actual number of rows in the rows scan that are actually affected.

3. Estimated number of rows estimates how many lines the scan affects.

4. The estimated size (in bytes) of the rows generated by the estimated row size operator.

5. Estimated data size estimates the magnitude of the impact.

2, ToolTips, current step execution information

Note: The information in this tips tells us what the object is, what the action is, what the data is, what the index is, whether it is sorted, estimated CPU, I/O, the number of rows affected, and the actual number of rows. For a list of specific parameters, see MSDN:https://msdn.microsoft.com/zh-cn/library/ms178071 (v=sql.90). aspx

3. Table Scan

This action occurs when there is no clustered index in the table, and there is no appropriate index. This operation is very performance-intensive, and his presence also means that the optimizer will traverse the entire table to find the data you need.

4, Clustered Index Scan (clustered index Scan), index Scan (non-clustered index Scan)

This icon can be used for two operations, a clustered index scan, and a non-clustered index scan.

Clustered Index Scan: The data volume of a clustered index is actually the table itself, that is, how many rows of tables there are, how many rows are clustered, then the clustered index scan is almost the same as the table scan, and a full table scan, traversing all the table data, finds the data you want.

Nonclustered index scanning: the volume of a nonclustered index is based on your index creation and can contain only the columns you want to query. Then a nonclustered index scan is all the rows of the columns contained in your nonclustered, to find out the data you want.

5, key lookup (key value lookup)

The first thing to say is to find, find and scan in performance is not a level, scan need to traverse the entire table, and find only need to extract data directly from the key value, return results, performance is better.

When you are looking for a column that is not completely contained by a nonclustered index, you need to use a key-value lookup to find columns on the clustered index that are not included in the nonclustered index.

6. Rid Lookoup (RID lookup)

Similar to the key-value lookup, except that the RID lookup is not fully contained by a nonclustered index, and the remaining columns are in a table that does not have a clustered index, does not have a key value lookup, and can only query data based on rows representing RIDs.

7. Clustered Index Seek (clustered index lookup), index Seek (nonclustered index lookup)

Both clustered index lookups and nonclustered index lookups use this icon.

Clustered index Lookup: The clustered index contains data for the entire table, that is, the data on the clustered index based on the key value.

Nonclustered index Lookups: Nonclustered indexes contain the data from the columns that were created at the index, and data on those nonclustered indexes based on the key values.

8. Hash Match

This icon is used in two places, one is the table association, and the other is the data aggregation operation.

Before I say the first of these two operations, let me say hashing (coding technology) and hash Table (data structure).

Hashing: In the database according to the data content of each row, converted into a unique symbol format, stored in the temporary hash table, when the original data needs to be restored back. Like encryption and decryption technology, but he can more effectively support the data query.

Hash table: The data is stored in a table in key/value form by hashing processing, and in the database he is placed in tempdb.

Next, let's say that the hash math table is associated with how the row data aggregation operates.

Table Associations:

For example, when associating two datasets, Hash match will place the smaller data set into the Hashtable with the hashing operation, and then a row-by-row traversal of the larger dataset matches the Hashtable with the corresponding pull data.

Data aggregation: When Count/sum/avg/max/min is required in a query, the data may be calculated by placing the data in the Hashtable in memory first.

9, Nested Loops

This operation symbol summarizes the datasets of two different columns into a single table. The output list in the hint message has two datasets, and the following data set (inner set) is scanned with the above data set (out set), so that the operation is complete until the scan is finished.

10. Merge Join

This association algorithm merges two already sequenced collections. If the two aggregations are unordered, then the set will be sorted and then one by one merged, because it is an ordered set, the top-down merge efficiency of the left and right two sets is quite fast.

11. Sort (sorted)

To sort the data collection, it is important to note that some data sets are self-ordered after the index scan.

12. Filter (filtering)

Filter by operator that appears after having the

13. Computer Scalar

Custom columns are required in the columns that need to be queried, such as COUNT (*) as CNT, select Name+ ' +age, and so on.

optimized operation According to the details of the execution plan

There will be a lot of suggestions to give, I do not give an example, given a few examples, want to optimize the experts, more need to understand the understanding.

1. If select * Typically, clustered indexes are better than nonclustered indexes.

2. If nested Loops is present, you need to check whether the clustered index is required, and whether the nonclustered index can contain all the required columns.

3, Hash match connection operation is more suitable for the need to do hashing algorithm set a small connection.

4, the Merge join need to check whether the original collection is already sorted, if there is no sort, the use of the index can be resolved.

5, a table scan occurs, clustered index Scan, nonclustered index scan, consider whether the statement can be added where restrictions, select * can remove unnecessary columns.

6. If the RID lookup occurs, can the index optimization be resolved.

7. When you see an index that is not what you want in the plan, see if you can force the index to be used to resolve the problem in the statement by using the index option select Cluname1,cluname2 from Table with (Index=indexname).

8. When you see a connection algorithm that is not what you want, try to force the problem to be solved using the algorithm you want. Statement forcing the use of the join algorithm: SELECT * from T1 LEFT join T2 on t1.id=t2.id option (Hash/loop/merge join)

9. See that the aggregation algorithm is not what you want to be, try forcing the aggregation algorithm you want to use. Example of a statement that enforces the use of an aggregation algorithm: Select Age, Count (age) as CNT from T1 GROUP BY-age option (Order/hash Group)

10. When you see that the parsing order is not what you want, or if this parsing sequence takes too much time, try forcing your order of execution. Option (Force order)

11. When you see that there are multiple threads merging to execute your SQL statement and affect performance, try forcing is not parallel operation. Option (MAXDOP 1)

12, in the stored procedure, due to different parameters cause the execution plan is different, also affect performance when attempting to specify parameters to optimize. Option (optiomize for (@name = ' zlh '))

13, do not operate redundant columns, redundant rows, do not do the necessary aggregation, sorting.

Summary

Already to 2:30 in the evening, query optimization did not do too detailed decomposition, forgive me, if you see this article after what questions, welcome to join the blog left Upper Corner group, Exchange Learning, Ann.

MSSQLSERVER Implementation plan detailed

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More