When you need to analyze the performance of a query, one of the best ways to see the execution plan for that query. The execution plan describes how the SQL Server query optimizer actually runs (or how it will run) a particular query.
There are several different ways to view the execution plan for a query. They include:
There is an option in SQL Server Query Analyzer called "Show Actual Execution Plan" (located in the Query drop-down menu). If this option is turned on, whenever you run a query in Query Analyzer, you will get a query execution plan (in graphical format) that appears in a separate window.
If you just want to see the execution plan and don't want to run the query, you can choose the show estimated Execution plan option (located in the Query drop-down menu). When this option is selected, the execution plan is immediately displayed (in graphical format). The difference between the two is that when a query is actually run, the operations on the current server are also taken into account. In most cases, the results of the execution plan produced by both methods are similar.
When establishing a SQL Server Profiler trace, an event that can be collected is the Misc:execution plan. This information (in the form of text) shows the calculated row schedule that the query optimizer uses to execute the query.
You can execute the SET SHOWPLAN_TEXT on command on the Query Analyzer. When this command is executed, all queries executed in the current Query Analyzer session will not run, but will display a text-based execution plan. When you execute a query that uses a temporary table, you must run the Set STATISTICS profile on statement before executing the query.
Among these options, I prefer to use the show actual execution plan option. It outputs information graphically and takes into account those operations on the current server. [7.0, 2000] Updated 8-5-2005
*****
If you see any of the following in the execution plan, you should treat them as warning signals and investigate them to identify potential performance issues. In terms of performance, each of the items shown below is not ideal.
Index or table scans ( indexed or sheet scan ) : may mean a better or extra index is needed.
Bookmark Lookups ( Bookmark Lookup ) : Consider modifying the current clustered index, using the cover index, to limit the number of fields in the SELECT statement.
Filter ( filtering ) : Remove any functions used in the WHERE clause, do not include views in SQL statements, and may require additional indexes.
Sort ( sort ) : Does the data really need to be sorted? Can I use an index to avoid sorting? Will the client sort be more efficient?
It is not possible to avoid these operations without exception, but the more you avoid them, the faster your query performance will be. [7.0,2000,2005]
*****
If you have temporary tables in a stored procedure or in other T-SQL batch code, you cannot use the show Estimated Execution plan option in Query Analyzer or Management Studio to evaluate the query. This stored procedure or batch code must be actually run. This is because when you run a query by using the show estimated Execution plan option, it is not actually running and the staging table is not created. Because the temporary table is not created, the code referencing the temporary table fails, resulting in an estimated execution plan that cannot be created successfully.
On the other hand, if you are using a table variable instead of a temporary table, you can use the show Estimated execution plan option. [7.0,2000,2005] Updated 8-5-2005
*****
If you analyze an execution plan for a very complex query in Query Analyzer or Management Studio, you may find its execution plan difficult to read and difficult to analyze. Then it is much easier to split it into parts according to the logic of the query, and then analyze the parts separately. [7.0,2000,2005] Updated 8-5-2005
*****
Graphical execution plans are not always easy to read and interpret. Here are some points to keep in mind when reviewing your execution plan:
Very complex execution plans are divided into sections that are listed on the screen separately. Each section represents a single process or step that the query optimizer must perform in order to get the final result.
Each step of the execution plan is often split into smaller sub-steps. Unfortunately, they are displayed on the screen from right to left. This means that you have to scroll to the far right of the graph execution plan to see where each step starts.
Each step is connected by an arrow between the steps to show the path to the query execution.
Finally, all parts of the query are aggregated to the left of the top of the screen.
If you move the mouse over any of the steps or sub-steps of any execution plan, a pop-up window appears with more detailed information about the step or sub-step.
If you move the mouse over the arrow of a connection step or a child step, you see a pop-up window that shows how many records are moved from one step or child step to another step or sub-step.
[7.0, 2000, 2005] Updated 8-5-2005
The arrow weights for each icon on the graphics execution plan are different. the weight of the arrows represents the amount of data rows moved between each icon and the relative cost of moving the data row size. the thicker the arrows, the higher the relative cost.
You can use this indicator to quickly measure a query. You might pay special attention to the thick arrows to see how it affects the performance of the query. For example, the coarse thread should be on the right side of the graphics execution plan, not the left. If you see them on the left, it means that too many rows of data are returned, and that the execution plan is not the best execution plan. [7.0,2000,2005]
*****
each part of the execution plan is assigned a cost percentage. it indicates how much of this part consumes the resources of the entire execution plan. When analyzing an execution plan, you should focus on those parts that have a high cost percentage. This allows you to find the most probable problem in a limited amount of time, thus repaying your investment in time. [7.0, 2000, 2005]
*****
you may notice that some parts of an execution plan have been executed more than once. as part of the execution plan analysis, you should focus some of your time on any parts that have been executed more than once to see if there is any way to reduce the number of times they are executed. The fewer times you execute, the faster your queries will become. [7.0, 2000, 2005]
*****
You can see the I/O and CPU costs in the execution plan. They have no "real" meaning, such as the amount of usage that represents a particular resource. The query optimizer uses these numbers to make the best choice. One implication of what they can refer to is that smaller I/O or CPU costs use less server resources than large I/O or CPU costs. [7.0, 2000, 2005]
*****
One of the most useful things you can look for when looking at a SQL Server graphical execution plan is how the query optimizer uses indexes for a given query to fetch data from a table. by looking at whether the index is useful, and how the index is used, it helps to determine whether the current index makes the query execute as fast as possible.
Move the mouse over the table name (and its icon) on the graphical execution plan, and a window pops up to see some information. This information lets you know whether the index is useful to get the data from the table and how it is used. This information includes:
- table Scan: If you see this information, there is no clustered index on the datasheet, or the query optimizer does not use an index to find it. Each row of the data table is checked. If the table is relatively small, the table scan can be very fast and sometimes even faster than using the index.
So, when you see an execution table scan, the first thing to do is to see how many rows of data the data table has. If not too many, then the table scan may provide the best overall performance. However, if the data table is large, the table scan is likely to take a long time to complete, query performance is greatly affected. In this case, a careful study is needed to add an appropriate index to the data table for this query.
Suppose you find that a query uses a table scan and has a suitable nonclustered index, but it is not used. What does that mean? Why is this index not used? Table scans are often faster than index scans if the amount of data that needs to be obtained is very large relative to the size of the data table, or if data selectivity is low (meaning that there are many duplicates in the same field). For example, if a data table has 10,000 rows of data, the query returns 1000 rows, and if the table does not have a clustered index, then the table scan will be faster than using a nonclustered index. Or if the data table has 10,000 rows of data, and there are 1000 duplicate data on the same field (where conditional clauses are useful to this field), the table scan will be faster than using nonclustered indexes.
When viewing pop-up windows on a data table on a graphical execution plan, note the estimated number of rows (estimated row count). This number is the best guess as to how many rows of data the query optimizer has been made to return. If you perform a table scan and the "estimated number of rows of data" value is high, it means that the number of records returned is many, and the query optimizer considers that performing a table scan is faster than using an available nonclustered index.
- Index seek : Index lookup means that the query optimizer uses a nonclustered index on the data table to find the data. Performance is usually very fast, especially when only a handful of data rows are returned.
- Clustered Index Seek ( clustered index lookup ): This means that the query optimizer uses a clustered index on the data table to find the data, and it performs quickly. In fact, this is the fastest type of index lookup that SQL Server can do.
- Clustered Index Scan ( clustered Index Scan ): Clustered index scans are similar to table scans, unlike clustered index scans performed on a data table with a clustered index. As with normal table scans, clustered index scans may indicate a performance issue. In general, there are two reasons for this clustered index scan to be performed. The first reason is that you might need to get too many rows of data relative to the total number of data rows on the datasheet. Review the estimated number of rows of data (estimated row Count) to verify this. The second reason may be because the field used in the WHERE condition clause is not selective. In any case, unlike a standard table scan, a clustered index scan does not always look for all the data in a data table, so a clustered index scan will generally be faster than a standard table scan. In general, to change a clustered index scan to a clustered index lookup, the only thing you can do is rewrite the query statement, making the statement more restrictive and returning fewer rows of data.
[7.0, 2000, 2005]
In most cases, the query optimizer parses the connection and uses the most efficient connection type to connect to the data table in the most efficient order. but that's not always the case. In the graphical execution plan you can see the icons for the various connection types used to represent the query. In addition, each connection icon has two arrows pointing to it. The upper arrow pointing to the connection icon represents the external table for the connection, and the following arrow represents the inner table of the connection. The other end of the arrow points to the concatenated data table name.
Sometimes in a multi-table connection query, the other end of the arrow points to not a data table but to another connection. If you move your mouse over an arrow pointing to an external connection and an internal connection, you can see a pop-up window that tells you how many rows of data are being sent to this connection for processing. External tables should always contain fewer rows of data than internal tables. If not, the query optimizer may choose an incorrect connection order (more information on this topic below).
First, let's look at the connection type. SQL Server can use three different techniques to connect tables: Nested loops (nested loops), hash (hash), and merge. In general, nested loops are the fastest connection type, but if nested loops are not possible, hashing or merging is used as the appropriate connection type. Both are slower than nested loops.
When you connect large tables, the merge connection may be the best option, not a nested loop join. The only way to make this clear is to test both to see which one is most efficient.
If you suspect that a query is slow because it uses a connection type that is not ideal, you can use the connection hint to cover the query optimizer's selection. Before using connection hints, you need to take some time to learn about each type of connection and how they work. This is a complex topic that is beyond the scope of this article.
The query optimizer chooses the most efficient connection type to connect to the data table. For example, the outer table of a nested loop join should be the smaller of the two tables connected. The hash join is the same, and its external table should be the smaller table. If you think the query optimizer chooses a connection order that is wrong, you can use the connection hint to cover it.
In many cases, the only way to make sure that you use connection hints to change the connection type or the connection order is to improve or decrease the performance by testing them to see what happens. [7.0, 2000, 2005]
*****
If your SQL Server has multiple CPUs and does not modify the default settings of SQL Server to limit the ability of SQL Server to use all CPUs on the server, the query optimizer will consider using parallel processing (parallelism) to execute some queries . Parallel processing refers to the ability to run a query concurrently on multiple CPUs. In many cases, a query running on multiple processors is faster than a query running on a single processor, but not always.
The query optimizer does not always use parallel processing, even when it is available. This is because the query optimizer takes into account a variety of factors before deciding to use parallel processing. For example, the number of active connections on the current SQL Server, how busy the CPU is, whether there is enough memory to run the parallel query, the number of rows of data to process, and the type of the query. After the query optimizer collects these real data, it determines whether parallel processing is the best choice to run the query. You may find that one query does not use parallel processing, but when you run the same query again later, parallel processing is used.
Sometimes, the cost of using multiple processors is greater than the amount of resources that can be saved by using them. Although the query processor does measure the effects of the pros and cons of using parallel queries, its conjecture is not always correct.
If parallel processing is suspected to hinder the performance of a query, you can use the option (MAXDOP 1) hint to turn off parallel processing of the query.
The only way to decide whether to use parallel processing is to test the query in both ways to see what happens. [7.0, 2000, 2005]
*****
When you view the graphics execution plan, you may find that the text of an icon is displayed in red instead of the usual black color. This means that some statistics are missing from the related tables, and statistics are necessary for the query optimizer to generate a good execution plan.
The lost statistics can be created by right-clicking on the icon and selecting "Create Missing Statistics". The Create Missing Statistics dialog box pops up, which makes it easy to create lost statistics.
You should always do this when you have the option to update lost statistics, as it is highly likely that you will benefit from the performance benefits of the query statement you are analyzing. [7.0, 2000, 2005]
*****
Sometimes you'll see the logo on the graph execution plan. "Assert" the icon. This means that the query optimizer is verifying that the query statement has referential integrity violations or conditional constraints. If not, then there is no problem. However, if there is one, the query optimizer will not be able to establish an execution plan for the query, and it will produce an error. [7.0, 2000, 2005]
*****
you will often see a graphical execution plan that identifies " Bookmark Lookup (Bookmark Lookup) " the icon. bookmark lookups are fairly common. The essence of bookmark lookup is to tell you that the query processor must find the data rows it needs from the data table or the clustered index, rather than reading directly from the nonclustered index.
For example, if the select,join of a query statement and all the fields in the WHERE clause do not exist in the nonclustered index that is used to locate rows of data that match the query criteria, Then the query optimizer has to do extra work in the data table or in the clustered index to find the fields that satisfy the query statement.
Another reason to cause a bookmark lookup is to use SELECT *. Since in most cases it will return more data than you actually need, you should never use SELECT *.
In terms of performance, bookmark lookups are not ideal. Because it asks for additional I/O overhead, look in the field to return the data rows that you want.
If you think that bookmark lookup is interfering with the performance of your query, there are four choices you can take to avoid it: You can build a clustered index that the WHERE clause will use, take advantage of the index intersection, build an overlay nonclustered index, or (if it is SQL Server 2000/2005 Enterprise Edition) An indexed view can be established. If none of this is possible, or if you use any of them to find more resources than bookmarks, then bookmark Lookup is the best choice. [7.0, 2000, 2005]
sometimes the query optimizer needs to tempdb Create a temporary worksheet in the database. If this is the case, it means that the graphical execution plan has an icon labeled Index Spool, Row Count Spool, or table Spool.
At any time, using a worksheet generally interferes with performance because additional I/O overhead is required to maintain the worksheet. Ideally, you should not use a worksheet. Unfortunately, it is not always possible to avoid using worksheets. Sometimes when working with worksheets is more efficient than other choices, its use actually enhances performance.
In either case, the worksheets in the graphics execution plan should be alert to you. You should double-check such query statements to see if there is a way to rewrite the query to avoid using the worksheet. There may be no way. But if you do, you're going a step further in improving the performance of this query. [7.0, 2000, 2005].
*****
See flow aggregation on a graphical execution plan (Stream Aggregate) The icon means that there is an aggregation of a single input. Stream aggregation operations are quite common when using the DISTINCT clause, or any aggregate function, such as AVG, COUNT, MAX, MIN, or sum. [7.0, 2000, 2005]
*****
Query Analyzer and Management Studio is not the only tool that can generate and display query execution plans. SQL Server Profiler can also display execution plans, but the format is in textual form. One advantage of using SQL Server Profiler to display execution plans is that it can generate execution plans for a large number of queries that actually run. If you use Query Analyzer and Management Studio, you can only run one at a time.
When using Profiler to capture and display execution plans, you must generate a trace using the following configuration:
Capturing events
- Performance:execution Plan
- Performance:show Plan All
- Performance:show Plan Statistics
- Performance:show Plan Text
The fields displayed
- StartTime
- Duration
- TextData
- Cpu
- Reads
- Writes
Filter conditions
- Duration. You will want to specify the maximum query execution time, such as 5 seconds, to avoid getting too much data.
Of course, you can capture more of the information in your tracking without an example, and the example above is just a guide. But you must remember not to capture too much data, or the running of the trace will affect the performance of the server. [7.0, 2000, 2005]
*****
If you are using the OPTION FAST prompt, you must be careful that the results of the plan may not be what you expect. The execution plan you see is based on the result of using the fast prompt instead of the actual execution plan for the entire query statement.
The fast hint is used to tell the optimizer to return data rows of the specified number of rows as soon as possible, even if doing so will prevent the overall performance of the query. The purpose of using this hint is to quickly return a record of a particular number of rows for the user, thus allowing them to produce a very fast illusion of speed. When the data row for the specified number of rows is returned, the remaining data rows are returned at their usual speed.
Therefore, if fast hints are used, the resulting execution plan is based only on the data rows returned by fast, not all rows of data to be returned. If you want to see the execution plan for all data rows, you must remove the fast prompt. [2000,2005]
Understanding SQL Server Execution Plans