Objective
About SQL Server Tuning series is a huge content system, not a word two words can be analyzed clearly, this article first in the SQL tuning in the most commonly used query plan to parse, to do a good job of the basic grasp, tamping basic skills! Then we talk about the overall statement tuning.
Learn how to read and understand query plans, and enumerate the most commonly used query execution operators in this article.
Technical preparation
Based on the SQL SERVER2008R2 version, a more concise case library (Northwind) from Microsoft was used for parsing.
One, the difference of different operators
When all T-SQL statements are executed, the statements are decomposed into basic structural units, collectively known as: Operators. Each operator implements a single basic operation, such as: Table Scan, index lookup, index scan, filtering, and so on. Each operator can iterate iteratively, or it can continue the child operator, so that it can form a query tree, that is: a query plan.
Each T-SQL statement is composed of multiple operators to form different query plans, and these query plans are valid for filtering the results, but when executed, the SQL Server query optimizer automatically finds an optimal one for us.
Each operator will have the output of the incoming and the result data of the source data, the input of the source data can be derived from other operators or read directly from the Data source table, the output of the result through its own operation. So each operator is independent. Indifferent to each other.
The following example
SELECT COUNT (*) from Orders
This statement generates two simple operators
Of course, the above two operators in SQL Server have their own expressions, and Count (*) is a stream aggregation operator.
Each operator has three properties that affect the efficiency of its execution
1. Memory consumption
All operators require a certain amount of fixed memory to complete execution. When a T-SQL statement is compiled to generate a query plan, SQL Server attempts to fix the memory for a query plan that is considered optimal, in order to waste time and speed up execution without having to re-request memory again for execution.
Then, there are operators that require additional memory space to store row data, and the amount of memory required by such operators is usually proportional to the number of rows of data processed. This can cause memory to not be applied if any of the following conditions affect execution performance
A, if the server is performing other similar memory consumption of large queries, resulting in insufficient system memory, the current query will have to delay, directly affect performance.
b, when the concurrency is too large, multiple queries compete with limited memory resources, the server will properly control concurrency and reduce throughput to maintain machine performance, which will also affect performance
C, if the current request to a small amount of available memory, SQL Server during execution and the disk to exchange data, usually using the tempdb temporary library operation, and this process is very slow. What's more, it exhausts the disk space on tempdb to end the failure
Typically, the operators that consume memory tend to have connection operations such as classification, hash joins , and hash aggregations .
2. Interrupt operation and non-blocking operation
The difference between the so-called blocking and non-blocking is whether the operator can directly output the result data when the data is entered.
A, when an operator generates an output line while consuming an input line, the operator is non-blocking.
For example, we often use the Select Top ... Operation, which is the output line operation at the same time as the input row, so this operation is non-blocking.
b, when an operator produces an output that waits for all data input, this operation is a block operation.
For example, in the case of the count (*) operation above, we need to wait for all data row inputs to be computed, so it is a blocking operation and a grouping calculation.
Tip: Not all blocking operations need to consume memory, such as COUNT (*) as a block, but it does not consume memory, but most blocking operations consume memory.
In most OLTP systems, we try to use the non-blocking operation instead of the blocking operation in order to better improve the time, for example, sometimes we use exists subquery to judge, than with select COUNT (*) >0 speed is more ideal.
Second, view the query plan
Above the SQL Server2005 version, the system provides three ways to display: Image mode, text mode, and XML mode.
1. Image mode
Image mode This is one of the most common ways to be clear, concise, and understandable. Very suitable for entry-level, of course, it has its own shortcomings such as complex T-SQL statements will produce larger images, see the need to shrink operations, more trouble.
SSMS defaults to provide us with a convenient button to view the query plan, when you need to view a statement, you just need to click on the
Let's take a look at an image-mode display of the query plan diagram
The actual execution plan produced by the above query statement is divided into different operators, from the rightmost clustered Index Scan (index Scan) and then through a series of operator processing to form the leftmost result output (select).
It is important to note that the direction of the arrows in the graph points to the flow of data, and the thickness of the arrow lines indicates the amount of data.
In a graphical execution plan, each of the different operators has its own property values, and we can move the mouse over the operator icon to view
Of course, you can also right-click on the icon, view the properties, go to the Properties panel and view more detailed property values
The detailed indicator values for each operator in this case are described later, but there are a few key values here, which can be said to mention a little bit first, about the performance parameters that affect the whole of this statement, we can choose the first select operator, right-click on the property value
This operator contains information such as the compilation time of the entire statement, the required memory, the size of the cache plan, the degree of parallelism, the memory authorization, the parameters required for compiling the execution, and the value of the variable.
This approach is presented to the user as a relatively straightforward way, so we have a lot of guidance in our statement tuning, and we know that a T-SQL statement can generate many different execution plans, and SQL Server helps us choose the optimal execution plan. Of course, we can also use its chosen execution plan to adjust its own statement to achieve the goal of optimization.
In view of the above objectives, SSMS provides us with an "Assessment execution plan" option, which is used only for evaluation guidance and is not actually executed, so it does not contain specific information such as actual rows
2. Text mode
This method does not provide shortcut keys by default in SSMs, we need to open the statement by ourselves, there are two ways to open it
A, only open the execution plan, not including the detailed evaluation value
SET Showplan_text on
B. Open all execution plan details, including evaluation values for each attribute
The way the text is displayed, without a clear arrow indication, use a vertical bar (|) To mark the child operator and the child parent of the current operation. and the flow direction is from the child operator to the parent operator, although the text presentation is not intuitive, but if you master the text reading, this way is easier to read, especially when it involves large large-scale plan, this way easier to save, process, search and compare.
Let's see a column.
This way the output is in the form of text, we can copy to the text editor for analysis, easy to find analysis and other operations
The above is the analysis of the text query plan, the simple point is to start from the innermost operator, the data flow direction is also in turn from the child operator flow to the parent operator.
The way above looks a little image way, easy to analyze and easier to use. But perhaps what is missing is the attribute operation information for each operator, which we use in the B method to view the detail
This way you can visually analyze the property evaluation values for each operator operation.
3. XML mode
The way XML presents the query plan is the new feature in SQL Server2005, which combines the advantages of text and graphical planning. The query plan is presented in the form of XML elements.
The main feature is that the use of XML is a normative way, can be programmed to use the standard XML operations, to facilitate the query. Also, XML data types are added to SQL Server2005, and XQuery functionality is built into the query. This approach is especially handy for viewing with very large query plans.
Open with the following statement
We can click on the output of the XML to view
XML shows a very detailed query plan information, we can simply analyze the next
- StmtSimple: Describes the execution text of T-SQL, and analyzes the type of the statement in detail, as well as the evaluation values for each property.
- Statementsetoptions: A set value that describes the various property values of the statement
- Queryplan: Is the detailed execution plan, including the number of parallel threads executing the plan, compile time, memory footprint, etc.
- Outputlist: Output parameter list
- In the middle this part is the specific information of the different execution operators, and includes the detailed pre-valuation and so on
- Parameterlist: Output parameter list
The information provided in XML is the most comprehensive, and the query plan type stored inside SQL Server is also an XML data type.
third, the analysis of the query plan
When we get a query plan for a statement, we should analyze the meaning of the execution plan inside, as well as the property values of the individual operators, and learn how to adjust the property values of each operator to improve the efficiency of the statement as a whole.
1. Scan and find
The two ways of scanning and finding (seek) are the basic ways in which the data in the database is obtained from the underlying data table.
A, when a table is a heap table (there is no index) or the data column gets no index to find, this kind of data can only be obtained through the full table scan filtering, if there is an index entry through the scanning of indexed items to obtain data, improve the speed of data acquisition.
This method is the simplest way to get the data
b, if there is an index entry for the currently searched data row, the index lookup (seek) is taken to retrieve the data.
This is the index lookup performed, because a nonclustered index entry exists in the OrderDate column in the Orders table. Here, by the way. If you introduce a static variable, SQL Server automatically parameterize the value to reduce the number of compilations and reuse the execution plan.
Because lookups only search for those pages that meet the criteria, the lookup efficiency is only proportional to the number of rows and pages that match the criteria, and does not relate to total rows in the entire table.
C, when the selected index column does not contain the output column, that is, to filter out the column items are not covered by the index, for this situation leads to another way to find
Bookmark Lookup (Bookmark lookup)
In fact, this approach is a compromise between scanning and finding, and we know that if you scan through a clustered index, you get all the columns, but this involves each row of data in the table, affecting performance, and conversely, if you just look through a clustered index, there are some columns that you can't get, if that's what we need. This is not accurate, so, in view of this, the introduction of a compromise way: Bookmark lookup (Bookmark lookup)
Simply put: Bookmark Lookup is the search for related column data by indexing page node data.
Let's look at a specific query.
Here to explain, in the SQL Server2005 SP2 version above, bookmark lookup is also called Key lookup, is actually a concept.
This approach has some drawbacks, that is, when the bookmark lookup, if the leaf node through the nonclustered index to find the clustered index data, this situation through the clustered index can quickly get to the data, if the nonclustered index keyword and the clustered index keyword does not have any association, in this case, Bookmark lookups perform random I/O operations into clustered indexes or heap tables, which can be time-consuming because sequential I/O scans are much more expensive than random I/O scans.
To address the problem described above, in the later versions of SQL Server2005, the Include keyword was introduced when index was created. By creating an index, you directly include the item that the bookmark is looking for, so that random I/O operations do not occur. The disadvantages of this approach can cause an increase in the index storage, but the benefits are largely negligible.
Conclusion
This article first of all, this article mainly describes the T-SQL statement tuning from the execution plan of the following, and introduced a few common simple operators, the next one will focus on some of our most commonly used operators and tuning techniques, including: Join operators, aggregation operators, union operators, parallel operations, etc. The content of SQL Server performance tuning involves a wide range of topics, followed by an analysis in the following article.
If you read this blog, feel that you have something to gain, please do not skimp on your " recommendation ".
SQL Server Tuning Series Basics