In-depth Analysis of SQL Server Query Plans

Source: Internet
Author: User
Tags: sql server, query


Query optimization is one of the most common tasks in SQL Server performance work, and database optimization is a broad topic in itself. This article therefore covers just one aspect: how to read and understand SQL Server query plans when optimizing queries. My own understanding of SQL Server is limited, so if you spot any mistakes, please point them out.

First, open SQL Server Management Studio and enter a query statement to see how SQL Server displays the query plan.
Note: The database used in this article was prepared specifically for a demo program and can be downloaded from this page.

select v.OrderID, v.CustomerID, v.CustomerName, v.OrderDate, v.SumMoney, v.Finished
from OrdersView as v
where v.OrderDate >= '2010-12-1' and v.OrderDate < '2011-12-1';

Here, OrdersView is a view, which is defined as follows:

SELECT dbo.Orders.OrderID, dbo.Orders.CustomerID, dbo.Orders.OrderDate,
       dbo.Orders.SumMoney, dbo.Orders.Finished,
       ISNULL(dbo.Customers.CustomerName, N'') AS CustomerName
FROM   dbo.Orders LEFT OUTER JOIN
       dbo.Customers ON dbo.Orders.CustomerID = dbo.Customers.CustomerID

For the query above, SQL Server produces the following query plan (click the [Display Estimated Execution Plan] button on the toolbar):

From this figure we can extract at least three useful pieces of information:

1. Which execution steps are costly. Clearly, the two rightmost steps carry a relatively high cost.
2. Which execution steps produce large amounts of data. The execution plan expresses each step's data volume by the thickness of the connecting line, which makes it easy to spot.
3. What operation is performed in each step.

For a slow query, we usually need to know which steps are expensive and then try some improvements. Generally speaking, if the problem cannot be solved by upgrading hardware, tuning the OS, or adjusting SQL Server settings, the remaining options are:

1. Add an index on the relevant column for the "scan" operation.
2. Sometimes rebuilding an index is also effective; see the discussion of index statistics later in this article.
3. Adjust the structure of the statement to guide SQL Server toward a different query strategy.
4. Adjust the table structure (split the table, or partition it).

Next, let's cover some important background theory that helps when reading execution plans.

How SQL Server Finds Records

At this point we have to talk about SQL Server indexes. SQL Server has two types of index: clustered and non-clustered. The difference is that a [clustered index] directly determines where a record is stored, so a record can be fetched directly from the clustered index. A [non-clustered index] stores two pieces of information: 1. the value of the indexed column(s), and 2. the location of the corresponding clustered-index entry (or a pointer to the row, if the table has no clustered index). Consequently, finding a record through the [clustered index] is obviously fastest.
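To make the distinction concrete, here is a minimal sketch; the table and index names are invented for illustration and are not part of the demo database:

-- A hypothetical table used only to illustrate the two index types.
CREATE TABLE dbo.Demo
(
    Id   int          NOT NULL,
    Name nvarchar(50) NULL
);
-- Clustered index: determines the physical storage order of the rows,
-- so a seek on Id fetches the record directly.
CREATE CLUSTERED INDEX CX_Demo_Id ON dbo.Demo (Id);
-- Non-clustered index: stores each Name value plus a locator
-- (here the clustered key, Id) pointing back to the full row.
CREATE NONCLUSTERED INDEX IX_Demo_Name ON dbo.Demo (Name);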

SQL Server provides the following methods to find the data records you need:

1. [Table Scan]: traverses the whole table, checking every row to find all matching records. Since it examines one row at a time, its efficiency is the worst.
2. [Index Scan]: uses an index to filter out a subset of the rows first, then searches those for all matching records. The set to examine is clearly smaller than in the first method, so it is faster than a [Table Scan].
3. [Index Seek]: uses the index to locate exactly where the records are stored, then fetches them; faster than the first two methods.
4. [Clustered Index Scan]: the same as a [Table Scan]. Do not assume it is different just because the word "index" appears: it means every row is scanned, row by row, in clustered-index order, because the records themselves are stored in that order. [Table Scan] merely means the table being scanned has no clustered index; the two operations are essentially the same.
5. [Clustered Index Seek]: fetches records directly via the clustered index, the fastest!

So when you find that a query is slow, first check which operations are costly, and then see whether the plan uses [Table Scan] or [Clustered Index Scan]. If either of these is involved, you can usually fix the problem by adding an index on the relevant column. However, an index has a cost too: whenever the table is modified, the indexes on the affected columns must be updated, so too many indexes can hurt write performance. There is also a case where adding an index is unsuitable: a column that only takes values such as 0 and 1. If, say, the overwhelming majority of rows hold 1, an index is nearly meaningless; in that situation the only real option is to store the 0 rows and 1 rows separately, and table sharding or partitioning is a good choice.
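For example, the demo query earlier filtered Orders on OrderDate, and the plan performed a scan because OrderDate has no index. A minimal sketch, assuming the demo database's dbo.Orders table (the index name is my own choice), that would let such a filter seek instead of scan:

-- Add a nonclustered index so filters on OrderDate can use an Index Seek
-- instead of a Clustered Index Scan.
CREATE NONCLUSTERED INDEX IX_Orders_OrderDate
ON dbo.Orders (OrderDate);

Keep the caveat above in mind: every extra index slows down inserts and updates on the table, so add indexes only where a plan shows an expensive scan.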

If adding indexes and adjusting tables does not solve the problem, you can try restructuring the statement to steer SQL Server toward a different query plan. This approach requires that: 1. the function the statement must perform is clear; 2. the structure of the tables being queried is clear; 3. the business context is clear. If the problem can be solved this way, it is a good solution too. Sometimes, however, SQL Server is smart enough that its execution plan does not change no matter how you rearrange the statement.

How can we compare the performance of two SQL statements that do the same job? I suggest two methods:
1. Put both query statements in [SQL Server Management Studio] and view their [execution plans]; SQL Server shows the [query cost] of each as a percentage. This method is simple and usable as a reference, but it is sometimes inaccurate, for instance when the index statistics are out of date (the reasons are discussed later in this article).
2. Write test code that calls the statements the way the real program does. This is more trouble, but it better represents the actual call pattern and the results are more trustworthy, so it is worth doing.
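Besides timing the calls in application code, SQL Server itself can report per-statement cost. A minimal sketch; the two candidate queries below are just placeholders drawn from the earlier demo, standing in for whatever pair you are comparing:

set statistics io on;   -- report logical/physical reads per statement
set statistics time on; -- report parse/compile and execution CPU time

-- candidate 1: through the view
select v.OrderID from OrdersView as v where v.OrderDate >= '2010-12-1';
-- candidate 2: against the base table directly
select o.OrderID from dbo.Orders as o where o.OrderDate >= '2010-12-1';

set statistics io off;
set statistics time off;

The [Messages] tab then shows reads and elapsed times for each statement, which is often a more reliable comparison than the cost percentages alone.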

SQL Server Join Methods

In SQL Server, each join is physically carried out in one of three more specific ways:

1. [Nested Loops Join]. If one join input is small and the other is large with an index created on its join column, an index nested loops join is the fastest join operation, because it requires the least I/O and the fewest comparisons.

A nested loops join, also called "nested iteration", uses one join input as the outer input table (shown as the top input in the graphical execution plan) and the other as the inner (bottom) input table. The outer loop processes the outer input table row by row; for each outer row, the inner loop runs and searches the inner input table for matching rows. The following pseudocode may help:

foreach (row r1 in outer table)
    foreach (row r2 in inner table)
        if (r1, r2 meet the matching conditions)
            output(r1, r2);

In the simplest case, the whole table or index is scanned during the search; this is called a "naive nested loops join". If an index is used for the search, it is called an "index nested loops join". If the index is built as part of the query plan (and destroyed as soon as the query completes), it is called a "temporary index nested loops join". The query optimizer considers all of these variations.

A nested loops join is particularly effective when the outer input is small and the inner input is large with a pre-built index. In many small transactions (ones that touch only a small set of rows), an index nested loops join beats merge joins and hash joins; in large queries, however, nested loops is usually not the best choice.
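If you want to observe a nested loops join in a plan, you can force one with a query hint. A sketch against the demo tables; forcing a join method here is for experimentation, not a recommendation:

-- Tiny outer input (one order) probing Customers via its clustered index:
-- the conditions under which nested loops shines.
select o.OrderID, c.CustomerName
from dbo.Orders as o
inner join dbo.Customers as c on o.CustomerID = c.CustomerID
where o.OrderID = 1
option (loop join);   -- force nested loops so its plan shape can be inspected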

2. [Merge Join]. If the two join inputs are both sizable but are sorted on their join columns (for example, because they were obtained by scanning sorted indexes), merge join is the fastest join operation. If both inputs are large and of similar size, a pre-sorted merge join performs about the same as a hash join; if the two sizes differ greatly, however, a hash join is usually much faster.

Merge join requires both inputs to be sorted on the merge columns, which are defined by the equality (ON) clauses of the join predicate. Typically the query optimizer scans an index, if one exists on the appropriate set of columns, or places a sort operator below the merge join. In rare cases there are several equality clauses, but only some of them can supply the merge columns.

Because each input is sorted, the Merge Join operator fetches one row from each input and compares them. In an inner join, for example, the rows are returned if they are equal; if they are not equal, the row with the smaller value is discarded and the next row is fetched from that input. This process repeats until all rows have been processed.

A merge join can be a regular operation or a many-to-many operation. A many-to-many merge join uses a temporary table to store rows (which can hurt efficiency). If there are duplicate values in each input, then while processing each duplicate in one input, the other input must be rewound to the start of the run of duplicates. Creating a unique index tells SQL Server there will be no duplicate values.

If a residual predicate is present, all rows that satisfy the merge predicate are evaluated against the residual predicate, and only the rows that satisfy it are returned.

The merge join itself is fast, but choosing one can be expensive if a sort is needed first. However, if the data volume is large and pre-sorted data can be obtained from existing B-tree indexes, merge join is often the fastest join algorithm available.
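As before, you can force this join type to study its plan. A sketch, assuming both sides can deliver rows sorted on CustomerID (for example from existing indexes); otherwise the plan will show an explicit sort operator below the join:

select o.OrderID, c.CustomerName
from dbo.Orders as o
inner join dbo.Customers as c on o.CustomerID = c.CustomerID
option (merge join);  -- force a merge join for comparison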

3. [Hash Join]. Hash joins handle large, unsorted, non-indexed inputs efficiently. They are useful for the intermediate results of complex queries because: 1. intermediate results are not indexed (unless they were explicitly saved to disk and then indexed) and are generally not usefully sorted for the next operation in the plan; 2. the optimizer only estimates the sizes of intermediate results, and for complex queries those estimates can be far off, so the algorithm that processes intermediate results must not only be efficient but also degrade gracefully when an intermediate result turns out much larger than expected.

Hash joins can reduce the need for denormalization. Denormalization is generally used to improve performance by cutting down join operations, despite the risk of redundancy (such as inconsistent updates). Hash joins reduce that need, and they also make vertical partitioning (storing groups of columns of a single table in separate files or indexes) a viable option for physical database design.

A hash join has two inputs: the build input and the probe input. The query optimizer assigns these roles so that the smaller of the two inputs becomes the build input.

Hash joins are used for many kinds of set-matching operations: inner join; left, right, and full outer join; left and right semi-join; intersection; union; and difference. Moreover, a variant of the hash join can perform duplicate removal and grouping, such as SUM(salary) GROUP BY department; these variants use a single input for both the build and probe roles.

Hash joins come in three flavors: in-memory hash join, grace hash join, and recursive hash join.

In-memory hash join: the hash join first scans or computes the entire build input and builds a hash table in memory, inserting each row into a hash bucket according to the hash value computed for the hash key. If the entire build input is smaller than the available memory, all rows can be inserted into the hash table. This build phase is followed by the probe phase: the entire probe input is scanned or computed one row at a time, the hash key value is computed for each probe row, the corresponding hash bucket is scanned, and matches are produced.

Grace hash join: if the build input does not fit in memory, the hash join proceeds in several steps; this is known as a "grace hash join". Each step has a build phase and a probe phase. First, the entire build and probe inputs are consumed and partitioned (using a hash function on the hash key) into multiple files. Using the hash function guarantees that any two matching rows must land in the same pair of files, so the task of joining two large inputs is reduced to multiple smaller instances of the same task. The hash join is then applied to each pair of partition files.

Recursive hash join: if the build input is so large that input for a standard external merge would require multiple merge levels, multiple partitioning steps and multiple partitioning levels are required. If only some of the partitions are oversized, additional partitioning steps are applied just to those. To keep all partitioning steps as fast as possible, large asynchronous I/O operations are used so that a single thread can keep multiple disk drives busy.

The optimizer cannot always determine during optimization which kind of hash join will be needed. SQL Server therefore starts with an in-memory hash join and gradually transitions to a grace hash join and then a recursive hash join, depending on the size of the build input.
If the optimizer mispredicts which of the two inputs is smaller (and therefore should have been the build input), the build and probe roles are swapped dynamically. The hash join ensures that the smaller overflow file is used as the build input. This technique is called "role reversal"; it can occur only after at least one file has spilled to disk.
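And the same experiment for a hash join; with no WHERE filter, both inputs are large and unsorted from the join's point of view, which is hash join territory:

select o.OrderID, c.CustomerName
from dbo.Orders as o
inner join dbo.Customers as c on o.CustomerID = c.CustomerID
option (hash join);   -- force a hash join; the smaller input becomes the build input

Running the same query with each of the three OPTION hints and comparing the plans side by side is a quick way to see the cost trade-offs SQL Server weighs.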

Note: you can also specify the join method explicitly in the statement itself, and SQL Server will do its best to respect your choice. For example, you can write: inner loop join, left outer merge join, inner hash join.
However, I recommend against doing this, because SQL Server's own choice is basically always right; try it if you don't believe me.

Well, that's a lot of theory. Now let's look at a practical example.

Viewing the Execution Process in More Detail

Earlier I showed a picture of the execution plan SQL Server chose for a query, but the information in it may not be detailed enough. Of course, you can hover the mouse over a node and details like the following appear:

I have the Chinese edition installed, so the tooltip is all in Chinese and I won't dwell on it. What I want to introduce instead is another way to view the execution process, one that shows more information than the tooltip and reflects the actual execution. (Alternatively, you can keep using the graphical view: click the [Include Actual Execution Plan] button on the toolbar before running the query.)

Let's return to SQL Server Management Studio, enter the following statements, and execute them.

set statistics profile on

select v.OrderID, v.CustomerID, v.CustomerName, v.OrderDate, v.SumMoney, v.Finished
from OrdersView as v
where v.OrderDate >= '2010-12-1' and v.OrderDate < '2011-12-1';

Note: with [set statistics profile on] added, the result looks like this:

As the picture shows, executing the query now returns two tables. The upper table contains the query results; the lower table shows the execution process. The lower table may look less intuitive and less friendly than the first picture in this article, but it conveys more information, and for complex queries it can actually be easier to read: a complex plan has so many steps that the graphical view becomes a huge picture that is hard to take in at a glance. Moreover, this execution-process table exposes two valuable figures (its first two columns).

Let's look at this execution-process table. I'll pick out the important columns:
[Rows]: the number of rows actually produced by an execution step. (Actual data, not an estimate.)
[Executes]: the number of times the step was executed. (Actual data, not an estimate.)
[Stmt Text]: a description of the step.
[EstimateRows]: the estimated number of rows the step will return.

In this execution-process table, I consider the first three columns the most important for query optimization. I explained the first two above; their meaning is clear, and their numbers also roughly reflect the cost of each step, so pay attention to them for slow queries. [Stmt Text] tells you what each step does. This kind of table is really tree-structured information (each row corresponds to a node in the graphical view), so I suggest reading it from the innermost rows outward. As an example, let me walk through the execution process shown in the table.

Row 5: [Clustered Index Seek(OBJECT:([MyNorthwind].[dbo].[Customers].[PK_Customers]), SEEK:([MyNorthwind].[dbo].[Customers].[CustomerID]=[MyNorthwind].[dbo].[Orders].[CustomerID]) ORDERED FORWARD)]. It means SQL Server performs a Seek on the Customers table, using the [Clustered Index Seek] method; the index involved is PK_Customers, and the seek value comes from [Orders].[CustomerID].

Row 4: [Clustered Index Scan(OBJECT:([MyNorthwind].[dbo].[Orders].[PK_Orders]), WHERE:([MyNorthwind].[dbo].[Orders].[OrderDate]>='2010-12-01 00:00:00.000' AND [MyNorthwind].[dbo].[Orders].[OrderDate]<'2011-12-01 00:00:00.000'))]. It means SQL Server performs a Scan on the Orders table, i.e. the worst method, equivalent to a [Table Scan]; because there is no index on the OrderDate column, this is the only choice.

Row 3: [Nested Loops(Left Outer Join, OUTER REFERENCES:([MyNorthwind].[dbo].[Orders].[CustomerID]))]. It means SQL Server joins the data produced by rows 4 and 5 using the [Nested Loops] method, where the outer table is Orders; the matching operation of the join was described in row 5.

Row 2: [Compute Scalar(DEFINE:([Expr1006]=isnull([MyNorthwind].[dbo].[Customers].[CustomerName],N'')))]. It means a call to the isnull() function is performed; see the view definition code in the earlier section for details.

Row 1: [SELECT [v].[OrderID],[v].[CustomerID],[v].[CustomerName],[v].[OrderDate],[v].[SumMoney],[v].[Finished] FROM [OrdersView] [v] WHERE [v].[OrderDate]>=@1 AND [v].[OrderDate]<@2]. Usually row 1 is the entire query, and it represents the query's return value.

Index Statistics: The Basis for Choosing a Query Plan

As mentioned above, an [execution plan] is the operation plan determined before the query actually runs. So how does SQL Server choose one? How does it know when to use an index, and which index to use? Whenever SQL Server is asked to run a query, it first checks whether an execution plan for that query already exists in the cache; if not, it must generate one. When generating a plan, it does not simply look at which indexes are available and pick one at random; instead, it consults data called index statistics. If you look carefully at the execution plan or the execution-process table above, you will notice that SQL Server predicts the amount of data produced by each step; it is precisely because it can predict these amounts that it can choose the method it considers best for executing the query. The [index statistics] supply that information. Curious yet? To give you a concrete feel for what [index statistics] look like, enter the following statement in [SQL Server Management Studio] and run it.

dbcc show_statistics (Products, IX_CategoryID)

The result is as follows:

First, a word about the command: dbcc show_statistics displays the index statistics we want to inspect. It takes two parameters: 1. the table name; 2. the index name.
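If you are not sure which indexes or statistics a table has in the first place, two standard system procedures can list them; a quick sketch for the demo Products table:

exec sp_helpindex 'dbo.Products';        -- lists the table's indexes
exec sp_helpstats 'dbo.Products', 'ALL'; -- lists all statistics on the table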

Now let's look at the command's result. It consists of three tables:
1. The first table lists the main attributes of the index statistics.

Name: the name of the statistics.
Updated: the date and time the statistics were last updated.
Rows: the number of rows in the table.
Rows Sampled: the number of rows sampled for the statistics.
Steps: the number of steps the data is divided into; corresponds to the third table.
Density: the selectivity of the first index-column prefix (excluding EQ_ROWS).
Average key length: the average length of all index columns.
String Index: "yes" means the statistics contain a string summary index, which can be used to estimate the result-set size for LIKE conditions. Applies only to leading columns of the char, varchar, nchar, nvarchar, varchar(max), nvarchar(max), text, and ntext data types.

2. The second table lists the selectivity of the various column-prefix combinations. The smaller the value, the less duplication and the higher the selectivity.

All density: the selectivity of the index-column prefix set (including EQ_ROWS). Note: a smaller value means higher selectivity; a value below 0.1 usually indicates a highly selective index.
Average length: the average length of the index-column prefix set.
Columns: the names of the index columns in the prefix to which All density and Average length apply.

3. The third table is a histogram of the data distribution, which SQL Server uses to estimate the row counts of certain execution steps.

RANGE_HI_KEY: the upper (maximum) value of each histogram step.
RANGE_ROWS: the estimated number of rows in each step, excluding the maximum value.
EQ_ROWS: the estimated number of rows equal to the step's maximum value.
DISTINCT_RANGE_ROWS: the estimated number of distinct values within each step, excluding the maximum value.
AVG_RANGE_ROWS: the average number of duplicates per value within each step, excluding the maximum value (RANGE_ROWS / DISTINCT_RANGE_ROWS for DISTINCT_RANGE_ROWS > 0).

To help you understand these numbers, especially the third table, see the following:

When I filled the test data, I deliberately used CategoryId values 1 through 8 (10 was added later temporarily), and each group got 78 rows, so the data in the third table of the [index statistics] is correct too. Based on these statistics, SQL Server can estimate the amount of data for each execution step, which in turn influences things like its choice of join method. Of course, when choosing a join method it also consults the selectivity in the second table. When generating an execution plan for a query, the query optimizer uses these statistics and, together with the relevant indexes, evaluates the cost of each candidate scheme and picks the best plan.

Here is another example of how important statistics are to the query plan. First, add a larger batch of data with the following code:

declare @newCategoryId int;
insert into dbo.Categories (CategoryName) values (N'Test statistics');
set @newCategoryId = scope_identity();

declare @count int;
set @count = 0;
while( @count < 100000 )
begin
    insert into Products (ProductName, CategoryID, Unit, UnitPrice, Quantity, Remark)
    values (cast(newid() as nvarchar(50)), @newCategoryId, N'', 100, @count + 1, N'');
    set @count = @count + 1;
end
go

update statistics Products;
go

Let's take a look at the index statistics:

Now run the same query twice; because the query parameter value differs, SQL Server chooses different execution plans:

select p.productId, t.quantity
from Products as p
left outer join [Order Details] as t on p.productId = t.productId
where p.categoryId = 26;    -- 26 is the newly generated CategoryId, so this query returns 100,000 records

select p.productId, t.quantity
from Products as p
left outer join [Order Details] as t on p.productId = t.productId
where p.categoryId = 6;     -- this query returns 95 records

As you can see, because the CategoryId values differ, SQL Server chooses completely different execution plans. The importance of statistics is clearly visible here.

After statistics are created, the database engine sorts the column values (the statistics are built on these values) and builds a "histogram" from up to 200 of the values, separated by intervals. The histogram specifies how many rows exactly match each interval value, how many rows fall within each interval, and the density or incidence of duplicate values in each interval.

SQL Server 2005 introduced additional information collected for statistics created on char, varchar, varchar(max), nchar, nvarchar, nvarchar(max), text, and ntext columns. This information, called a "string summary", helps the query optimizer estimate the selectivity of string-pattern query predicates. When a query contains a LIKE condition, the string summary allows a more accurate estimate of the result-set size and continual refinement of the query plan. These include conditions such as WHERE ProductName LIKE '%Bike' and WHERE Name LIKE '[CS]heryl'.

Since [index statistics] are so important, when are they generated and updated? In fact, we never need to maintain [index statistics] manually: SQL Server maintains them automatically, and it also has options that control how they are updated:

Automatic statistics features

When an index is created, the query optimizer automatically stores statistics about the indexed columns. In addition, when the AUTO_CREATE_STATISTICS database option is ON (the default), the database engine automatically creates statistics for non-indexed columns used in predicates.
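The options discussed in this section are plain database settings. A sketch, assuming the demo database name MyNorthwind:

-- The default is already ON; shown only to illustrate where the switch lives.
ALTER DATABASE MyNorthwind SET AUTO_CREATE_STATISTICS ON;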

As the data in a column changes, its index and column statistics can become stale, causing the query optimizer to make suboptimal choices about how to process the query. For example, suppose you create a table with an index column and 1,000 rows in which every value in the indexed column is unique; the query optimizer will consider that column an excellent way to locate query data. If the data is then updated so that many duplicate values exist, the column is no longer an ideal candidate, but the optimizer, going by the outdated index distribution statistics (based on the pre-update data), will still treat it as a good one.

When the AUTO_UPDATE_STATISTICS database option is ON (the default), the query optimizer automatically updates statistics periodically as the data in a table changes. A statistics update is triggered when the statistics used by a query's execution plan fail a test against the current data. Sampling is performed randomly across data pages, drawn from the table or from the smallest non-clustered index on the columns the statistics require. Once a data page has been read from disk, all rows on it are used to update the statistics. Roughly speaking, statistics are updated when about 20% of the rows have changed, but the optimizer always keeps the number of sampled rows as small as possible. For tables smaller than 8 MB, a full scan is always performed to gather the statistics.

Sampling the data (rather than analyzing all of it) keeps the cost of automatic statistics updates to a minimum. In some cases, though, sampling cannot capture the precise characteristics of the data; you can then use the SAMPLE and FULLSCAN clauses of the UPDATE STATISTICS statement to control how much data is sampled when updating statistics for individual tables manually. The FULLSCAN clause scans all the data in the table to gather statistics; the SAMPLE clause specifies either a percentage or an absolute number of rows to sample.
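For example, a manual update on the demo Products table might look like this (FULLSCAN is accurate but expensive; SAMPLE trades accuracy for speed):

UPDATE STATISTICS dbo.Products WITH FULLSCAN;          -- scan every row
-- or
UPDATE STATISTICS dbo.Products WITH SAMPLE 50 PERCENT; -- sample about half the rows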

In SQL Server 2005, the database option AUTO_UPDATE_STATISTICS_ASYNC provides asynchronous statistics updating. When this option is ON, queries do not wait for statistics updates before compiling: the out-of-date statistics are placed in a queue and updated by a worker thread in a background process, while the query and any concurrent queries compile immediately using the existing, out-of-date statistics. Query response time is predictable because there is no waiting, but the stale statistics may lead the query optimizer to an inefficient plan. Queries that start after the updated statistics are ready will use them. This may cause cached plans to be recompiled (depending on the older statistics version). Asynchronous statistics updates cannot occur if data definition language (DDL) statements (such as CREATE, ALTER, or DROP) appear in the same explicit user transaction.

The AUTO_UPDATE_STATISTICS_ASYNC option is set at the database level and governs the update method for all statistics in the database. It applies only to updating statistics; it cannot be used to create statistics asynchronously. It has effect only when AUTO_UPDATE_STATISTICS is ON. By default, AUTO_UPDATE_STATISTICS_ASYNC is OFF.
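As a sketch, again assuming the demo database name MyNorthwind:

ALTER DATABASE MyNorthwind SET AUTO_UPDATE_STATISTICS ON;        -- must be ON for the async option to take effect
ALTER DATABASE MyNorthwind SET AUTO_UPDATE_STATISTICS_ASYNC ON;  -- update stale statistics in the background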

From the description above we can see that for large tables there is still a chance that statistics are not updated promptly, and when that happens the query optimizer's judgment may suffer.
Some people have this experience: for certain slow queries, they think of rebuilding the index, and it often helps. There is a reason this works. Sometimes a query suddenly becomes slow precisely because the statistics were not updated in time, which affected the optimizer's judgment; rebuilding the index lets the optimizer see the latest data distribution and naturally avoids the problem. Remember the execution-process table I displayed with [set statistics profile on]? It shows both the actual and the estimated row counts for each step. Before deciding to rebuild an index, you can use [set statistics profile on] to check whether the actual row counts differ greatly from the estimates; if they do, you can manually update the statistics and then try again.
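A sketch of that remedy, using names from the demo database:

-- Either refresh the statistics directly...
UPDATE STATISTICS dbo.Products;
-- ...or rebuild the indexes, which rebuilds their statistics
-- (with the equivalent of a full scan) as a side effect.
ALTER INDEX ALL ON dbo.Products REBUILD;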

Optimizing Queries That Use Views

A view is defined by a query statement, so in essence it is a query; but optimizing it differs from optimizing an ordinary query statement. The main difference is that analyzing only the view's defining query is of limited value, because most views are not used directly: before use, a WHERE clause is added, or the view is placed in the FROM clause of some other statement. Here is an example. My demo database contains the view OrdersView, whose definition appeared earlier. Let's see what execution plan results if the view is queried directly:

From this plan we can see that SQL Server scans the whole Orders table, which should be quite inefficient. Now look at the following query:

This execution plan differs from the previous one: the search of the Orders table has changed from [Clustered Index Scan] to [Clustered Index Seek], and the cost percentages of the two rightmost steps have changed too. This is enough to show that the same view behaves quite differently depending on the filter applied, so when optimizing a view it is best to apply the filter conditions the real callers will actually use, and decide how to optimize from there.

To go further, let's look at this view's execution plan under three different queries.

select * from dbo.OrdersView where OrderId = 1;
select * from dbo.OrdersView where CustomerId = 1;
select * from dbo.OrdersView where OrderDate >= '2010-12-1' and OrderDate < '2011-12-1';

Obviously, for the same view, the execution plan varies significantly under different filtering conditions.

Recommended reading: MSDN articles

Index statistics
http://msdn.microsoft.com/zh-cn/library/ms190397(SQL.90).aspx

Query optimization recommendations
http://msdn.microsoft.com/zh-cn/library/ms188722(SQL.90).aspx

Checklist for analyzing slow-running queries
http://msdn.microsoft.com/zh-cn/library/ms177500(SQL.90).aspx

Logical and physical operators
http://msdn.microsoft.com/zh-cn/library/ms191158(SQL.90).aspx


What is the difference in the query execution plan between querying with and without an index in SQL Server?

The difference is:
With an index, you usually see an index scan, or a clustered index scan (or seek).
Without one, you see a table scan.
The query analyzer is now quite intelligent, though: a table that has indexes will not necessarily use them. It judges the I/O and read-ahead costs and automatically chooses between a table scan and an index scan.

In SQL Server Query Analyzer, is GO used in pairs or on its own?

GO
Signals the end of a batch of Transact-SQL statements to the SQL Server utilities.

Example 1:
USE AdventureWorks;
GO
DECLARE @MyMsg VARCHAR(50)
SELECT @MyMsg = 'Hello, World.'
GO -- @MyMsg is not valid after this GO ends the batch.

-- Yields an error because @MyMsg not declared in this batch.
PRINT @MyMsg
GO

SELECT @@VERSION;
-- Yields an error: Must be EXEC sp_who if not first statement in
-- batch.
sp_who
GO

A SQL Server application can send multiple Transact-SQL statements to an instance of SQL Server as a single batch for execution; the statements in the batch are then compiled into one execution plan. When programmers execute ad hoc statements in the SQL Server utilities, or build scripts of Transact-SQL statements to run through those utilities, they use GO to signal the end of a batch.

Example 2:
USE AdventureWorks;
GO
DECLARE @NmbrContacts int
SELECT @NmbrContacts = COUNT(*)
FROM Person.Contact
PRINT 'The number of contacts as of ' +
      CAST(GETDATE() AS char(20)) + ' is ' +
      CAST(@NmbrContacts AS char(10))
GO
