SQL Server query execution plan analysis

Source: Internet
Author: User
Tags sql server query management studio

To analyze the performance of a query, you can view {
Tagshow (event, '% D6 % B4 % D0 % D0 % BC % C6 % BB % AE ');
} "Href =" javascript:; "target =" _ Self ">Execution Plan. Execution Plan description {
Tagshow (event, 'SQL ');
} "Href =" javascript:; "target =" _ Self ">SQL{
Tagshow (event, 'server ');
} "Href =" javascript:; "target =" _ Self ">ServerHow the query optimizer runs (or how it runs) a specific query.

There are several different ways to view the queried execution plan. They include:

The SQL Server Query analyzer has an option called "display actual execution plan" (in the "query" drop-down menu ). If this option is enabled, a query execution plan (in graphic format) displayed in a separate window will be obtained whenever a query is run in the query analyzer ).

If you only want to view the execution plan and do not want to run the query, you can select the "show estimated execution plan" option (in the "query" drop-down menu ). When this option is selected, the execution plan is immediately displayed (in graphic format ). The difference between the two is that when a query is run, the calculation on the current server will also be taken into account. In most cases, the execution plans produced by the two methods produce similar results.

When an SQL Server Profiler tracing is created, one event that can be collected is MISC: Execution Plan. This information (in the form of text) shows the query optimizer used to execute the query plan.

You can run the set showplan_text on command on the query analyzer. After this command is executed, all queries executed in the current query analyzer session will not run, but a text-based execution plan will be displayed. When executing a query that uses a temporary table, you must run the set statistics profile on statement before executing the query.

Among the above options, I prefer to use the "show actual execution plan" option. It outputs Information graphically and takes into account the operations on the current server. [7.0, 2000] updated 8-5-2005

*****

If you see any of the following items in the execution plan, you should treat them as warning signals and investigate them to identify potential performance problems. In terms of performance, each item shown below is not ideal.

Index or table scans (index or table scan): it may mean better or additional indexes are required.

Bookmark lookups: you can modify the current clustered index and overwrite the index to limit the number of fields in the SELECT statement.

Filter: Remove any function used in the WHERE clause. do not include a view in an SQL statement. Additional indexes may be required.

Sort: Do data really need to be sorted? Can indexes be used to avoid sorting? Will sorting on the client be more efficient?

Without exception, it is impossible to avoid these operations, but the more you avoid, the faster the query performance. [7.0,]

*****

If there is a stored procedure or another T-SQL batch {
Tagshow (event, '% B4 % fa % C2 % EB ');
} "Href =" javascript:; "target =" _ Self ">CodeWhen a temporary table is used, you cannot use the "show estimated execution plan" option in the query analyzer or management studio to evaluate the query. You must actually run the stored procedure or batch processing code. This is because when a query is run using the "show estimated execution plan" option, it is not actually run, and the temporary table is not created. Because the temporary table is not created, the code for referring to the temporary table will fail, and the estimated execution plan cannot be created successfully.

On the other hand, if you are using Table variables rather than temporary tables, you can use the "show estimated execution plan" option. [7.0,] updated 8-5-2005

*****

If you analyze an execution plan for a very complex query in the query analyzer or management studio, it may be difficult to understand or analyze the execution plan. Therefore, it is much easier to split it into several parts according to the query logic and analyze these parts separately. [7.0,] updated 8-5-2005

*****

Graphical execution plans are not always easy to understand and explain. Remember the following points when viewing the execution plan:

Very complex execution plans are divided into multiple parts which are listed on the screen separately. Each part represents a single process or step that the query optimizer must perform to get the final result.

Each step of the execution plan is often split into smaller sub-steps. Unfortunately, they are displayed on the screen from right to left. This means you have to scroll to the rightmost side of the graphic execution plan to see where each step starts.

Each step is connected to the sub-step by an arrow to display the query execution path.

Finally, all the queried parts are summarized on the left at the top of the screen.

If you move the cursor over any step or sub-step of the Execution Plan, a pop-up window is displayed, showing more detailed information about the step or sub-step.

If you move the cursor over the arrow of the connection step or sub-step, you can see a pop-up window showing how many records are moved from one step or sub-step to another or sub-step.

[7.0, 2000,200 5] updated 8-5-2005

The arrow width of each icon connected to the graph execution plan is different. The arrow width indicates the number of data rows to move between each icon and the relative cost required to move the data row size. The thicker the arrow, the higher the relative cost.

You can use this indicator to quickly measure a query. You may pay special attention to the rough arrow to learn how it affects the query efficiency. For example, the line header should be on the right side of the graphic execution plan, not the left side. If you see them on the left, it means that too many data rows are returned, and this execution plan is not the best execution plan. [7.0,]

*****

Each part of the Execution Plan is allocated a cost percentage. It indicates the amount of resources consumed by the execution plan. When analyzing an execution plan, you should focus on those parts with a high cost percentage. In this way, you can find the most likely problem in a limited time, and return your {
Tagshow (event, '% Cd % B6 % D7 % ca ');
} "Href =" javascript:; "target =" _ Self ">Investment. [7.0, 2000,200 5]

*****

You may notice that some parts of an execution plan have been executed more than once. As part of execution plan analysis, you should focus some of your time on any part that has been executed more than once to see if there is any way to reduce the number of times they are executed. The smaller the number of executions, the faster the query speed. [7.0, 2000,200 5]

*****

In the execution plan, you can see the I/O and CPU costs. They do not have the "actual" meaning, for example, the usage of specific resources. The query optimizer uses these numbers to make the best choice. They can be used as a reference. Smaller I/O or higher CPU costs use less server resources. [7.0, 2000,200 5]

*****

When viewing the SQL Server graphical execution plan, one of the most useful things you can find is how the query optimizer uses indexes for a given query to retrieve data from the table. By checking whether the index is useful and how the index is used, you can determine whether the current index makes the query run as fast as possible.

Move the cursor over the table name (and its icon) in the graph execution plan, and a window is displayed, showing some information. This information allows you to know whether indexes are used to obtain data from tables and how they are used. The information includes:

· Table scan: if this information is displayed, it indicates that there is no clustered index on the data table, or the query optimizer does not use the index for searching. That is, each row of the data table is checked. If the data table is relatively small, table scanning can be very fast, sometimes even faster than using indexes.

Therefore, when a table scan is performed, the first thing to do is to check the number of data rows in the data table. If not, table scanning may provide the best overall performance. However, if the data table is large, it is very likely that the table scan will take a long time to complete, and the query efficiency will be greatly affected. In this case, you need to carefully study and add an appropriate index for the data table for this query.

Suppose you find that a query uses a table scan and has a suitable non-clustered index, but it is not used. What does this mean? Why is this index useless? If the data size to be obtained is very large relative to the data table size, or the data selectivity is not high (meaning there are many duplicate values in the same field), table scanning is often faster than index scanning. For example, if a data table has 10000 data rows and the query returns 1000 rows, if the table has no clustered index, the table scan is faster than using a non-clustered index. Or if the data table has 10000 data rows and the same field (where condition is useful to this field) has 1000 million duplicate data records, the table scan will be faster than using non-clustered indexes.

When viewing the pop-up window on the data table on the graph execution plan, note that "the estimated number of data rows (estimated row count )". This number is the best estimate of the number of rows returned by the query optimizer. If a table scan is performed and the "Estimated number of data rows" value is very high, it means a large number of records are returned. The query optimizer considers that executing a table scan is faster than using an available non-clustered index.

· Index seek (index search): Index Search means that the query optimizer uses non-clustered indexes on the data table to search for data. Performance is usually very fast, especially when only a few data rows are returned.

· Clustered index seek (clustered index search): this means that the query optimizer uses clustered indexes on the data table to search for data, and the performance is very fast. In fact, this is the fastest index search type that SQL server can do.

· Clustered index scan: clustered index scan is similar to table scan. The difference is that clustered index scan is performed on a data table with clustered index. Like normal table scans, clustered index scans may indicate performance problems. In general, there are two reasons for this clustered index scan. The first reason is that, compared with the total number of data rows on the data table, too many data rows may need to be obtained. Check "Estimated number of data rows (estimated row count)" to verify this. The second reason may be that the fields used in the WHERE clause are not highly selective. In any case, unlike standard table scans, clustered index scans do not always search for all data in a data table. Therefore, clustered index scans are generally faster than standard table scans. Generally, you need to change clustered index scanning to clustered index search. The only thing you can do is to rewrite the query statement to limit the statement and return less data rows.

[7.0, 2000,200 5]

In most cases, the query optimizer analyzes connections and uses the most efficient connection type to connect data tables in the most efficient order. But not always. In the graphic execution plan, you can see icons that represent different connection types used in the query. In addition, each connection icon has two arrows pointing to it. The arrow above the connection icon indicates the External table of the connection, and the Arrow below indicates the internal table of the connection. The other end of the arrow points to the name of the connected data table.

Sometimes in Multi-table join queries, the other end of the arrow points to not one data table, but another connection. If you move the cursor over the Arrow between the external connection and the internal connection, you can see a pop-up window that shows you how many data rows are sent to this connection for processing. The External table should always contain fewer data rows than the internal table. If not, the connection sequence selected by the Query Optimizer may be incorrect (the following is more information about this topic ).

First, let's look at the connection type. SQL Server can use three different {
Tagshow (event, '% BC % Ca % f5 ');
} "Href =" javascript:; "target =" _ Self ">TechnologyTo join a data table: nested loop, hash, and merge ). In general, nested loops are the fastest connection type, but if nested loops cannot be used, hash or merge will be used as the appropriate connection type. Both are slower than nested loop connections.

When a large table is connected, the merge join operation may be the best option, rather than nested loop join. The only way to clarify this is {
Tagshow (event, '% B2 % E2 % Ca % D4 ');
} "Href =" javascript:; "target =" _ Self ">TestTo see which one is most efficient.

If you suspect that a slow query speed may be caused by the unsatisfactory connection type, you can use the connection prompt to overwrite the selection of the query optimizer. Before using the connection prompt, You need to spend some time learning about each connection type and how they work. This is a complex topic that is beyond the scope of this article.

The query optimizer selects the most efficient connection type to connect to the data table. For example, the External table connected by nested loops should be the smaller table in the two joined tables. The same is true for hash join. Its External table should be smaller. If you think the connection sequence selected by the query optimizer is incorrect, you can use the connection prompt to overwrite it.

In many cases, the only way to confirm whether to use the connection prompt to change the connection type or order is to improve or reduce the efficiency is to test them to see what happened. [7.0, 2000,200 5]

*****

If your SQL Server has multiple CPUs and the default settings of SQL Server are not modified to limit the SQL server's ability to use all the CPUs on the server, the query optimizer will consider using parallelism to execute some queries. Parallel Processing refers to the ability to run one query simultaneously on multiple CPUs. In many cases, a query running on multiple processors is faster than a query running only on a single processor, but not always.

The query optimizer does not always use parallel processing, even when it can be used. This is because the query optimizer takes into account various factors before deciding to use parallel processing. For example, the number of active connections on the current SQL Server, CPU busy, whether there is enough memory to run the parallel query, the number of data rows to be processed, and the type of the query. After the query optimizer collects the real data, it determines whether parallel processing is the best choice to run the query. You may find that parallel processing is not used for a query, but parallel processing is used when the same query is run again later.

Sometimes, the cost of using multiple processors is greater than the resources that can be saved by using them. Although the query processor does measure the positive and negative effects of using parallel queries, its conjecture is not always correct.

If you suspect that parallel processing is compromising the performance of a query, you can use the option (maxdop 1) prompt to disable parallel processing of the query.

The only way to determine whether to use parallel processing is to test the query in these two ways to see what happened. [7.0, 2000,200 5]

*****
When you view the graph execution plan, you may find that the text of an icon is displayed in red rather than black in general. This means that the statistical data of related tables is lost. The statistical data is necessary for the query optimizer to generate a good execution plan.

You can right-click this icon and select "create missing statistics" to create the missing statistics. The "create lost statistical data" dialog box is displayed, through which you can easily create lost statistical data.

This should always be done when you can choose to update the lost statistics, because it is very likely to benefit the efficiency of the query statements you are analyzing. [7.0, 2000,200 5]

*****

Sometimes you will see the "assert" icon on the graphic execution plan. This means that the query optimizer is verifying whether the query statement violates the integrity or condition constraints of the reference. If no, there is no problem. However, the query optimizer will not be able to create an execution plan for the query and will generate an error. [7.0, 2000,200 5]

*****

You will often see the icon marked as "bookmark lookup" on the graphic execution plan. Searching bookmarks is quite common. The essence of bookmarksearch is to tell you that the query processor must search for the required data rows from a data table or clustered index, rather than directly reading from non-clustered indexes.

For example, if all the fields in the select, join, and where clauses of a query statement do not exist in the non-clustered index used to locate the data row that meets the query conditions, then the query optimizer has to do additional work to find the fields that meet the query statement in the data table or clustered index.

Another reason for bookmarked search is that select * is used *. In most cases, it will return more data than you actually need, so you should never use select *.

In terms of performance, it is not ideal to search for bookmarks. Because it will request additional I/O overhead to search in the field to return the required data rows.

If you think that the bookmarked search does not affect the query performance, there are four options to avoid it: you can create a clustered index that will be used by the WHERE clause and take advantage of the index intersection, create a covered non-clustered index, or (if it is SQL Server 2000/2005 Enterprise Edition) Create an index view. If none of these are possible, or any of them will consume more resources than the bookmarks, then the best choice is to search for bookmarks. [7.0, 2000,200 5]

Sometimes the query optimizer needs to create a temporary worksheet in the tempdb database. If this is the case, it means that there are icons marked as index spool, row count spool or table spool in the graph execution plan.

Using a worksheet at any time may affect performance, because additional I/O overhead is required to maintain the worksheet. Ideally, do not use worksheets. Unfortunately, you cannot always avoid using worksheets. Sometimes, when using a worksheet is more efficient than other options, it actually improves performance.

Regardless of the situation, worksheets in the graphic execution plan should alert you. You should carefully check such query statements to see if you can rewrite the query to avoid using worksheets. There may be no way. However, if there is one, you will step forward in improving the query performance. [7.0, 2000,200 5].

*****
The stream aggregate icon is displayed in the graphic execution plan, which means that a single input is aggregated. When a distinct clause or any aggregate function is used, such as AVG, Count, Max, Min, or sum, stream aggregation is quite common. [7.0, 2000,200 5]

*****
The query analyzer and Management studio are not the only tools that can generate and display query execution plans. SQL Server Profiler can also display execution plans in text format. One advantage of using SQL Server Profiler to display an execution plan is that it can generate an execution plan for a large number of actually running queries. If query analyzer and Management studio are used, only one query analyzer can be run at a time.

When using profiler to capture and display execution plans, you must use the following configuration to generate a trail:

 

Capture events

· Performance: Execution Plan

· Performance: Show plan all

· Performance: Show plan statistics

· Performance: Show plan text

 

Field displayed

· Starttime

· Duration

· Textdata

· CPU

· Reads

· Writes

 

Filter Condition

· Duration. You will want to specify the maximum query execution time, for example, 5 seconds, so as to avoid getting too much data.

Of course, you can capture more information without examples in your tracing. The above example is just a guide. But remember not to capture too much data. Otherwise, tracing may affect the server performance. [7.0, 2000,200 5]

*****
If the option fast prompt is used in the query, you must be careful that the execution plan result may not be what you expected. In this case, the execution plan you see is based on the results with the fast prompt, rather than the actual execution plan of the entire query statement.

The fast prompt is used to inform the query optimizer to return the data rows of the specified number of rows as quickly as possible, even if this will prevent the overall performance of the query. The purpose of this prompt is to quickly return records of a specific number of rows, which leads to the illusion that the speed is very fast .. When the data rows of the specified number of rows are returned, the remaining data rows are returned at their normal speed.

Therefore, if the fast prompt is used, the generated execution plan is only based on the data rows returned by the fast, instead of all data rows to be returned by the query. If you want to view the execution plan of all data rows, you must remove this fast prompt. [2000,2005]

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.