SQL Server 2016 improves the Query Optimizer
The previous two versions of SQL Server improved performance mainly by adding new features, while SQL Server 2016 focuses on improving existing functionality.
Cardinality Estimator
The cardinality estimator is the core of any query optimizer. It examines the statistics of the tables being queried and the operations being performed, and estimates the number of rows at each step of the query execution plan. Experienced DBAs know that bad cardinality estimates can seriously degrade database performance. Possible problems include:
- An incorrect index is selected;
- Incorrect join operators (such as nested loops, hash, or merge) are selected;
- Excessive memory allocation hinders other queries;
- Too little memory is allocated, resulting in excessive data overflow to tempdb.
Given the importance of the cardinality estimator, you may be surprised that the one used by SQL Server 2012 was essentially the same as the one introduced with SQL Server 7.0 in 1998. Only two years ago did we see the first large-scale redesign of the cardinality estimation process in the SQL Server Query Optimizer. To learn more about that version, read the white paper "Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator".
SQL Server 2016 builds on that work. When the compatibility level is set to 130, it can produce more accurate estimates. However, regressions are also possible, so before changing the compatibility level in a production environment you should thoroughly test your existing database.
Compatibility Level
You may not be familiar with this term. The compatibility level plays two key roles in SQL Server. First, it determines which database features are available: when a database is set to a lower compatibility level, some newer features cannot be used. This generally does not cause upgrade problems, because such databases were designed with the older feature set in mind.
The compatibility level also controls which query optimizer and cardinality estimator are used. For a well-tuned database, you can choose a lower compatibility level to reduce the risk of performance regressions, forcing SQL Server to use the query optimizer from an earlier version.
In some cases you need more fine-grained control. For example, SQL Server 2016 lets you set the compatibility level to 130 to use all the new features while still using the legacy cardinality estimator. This can be configured with the following command:
ALTER DATABASE SCOPED CONFIGURATION
SET LEGACY_CARDINALITY_ESTIMATION = ON;
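To confirm the current setting, you can inspect the standard catalog view for database-scoped configurations (a quick check, not required by the feature itself):

```sql
-- Inspect the database-scoped configuration value for
-- LEGACY_CARDINALITY_ESTIMATION (0 = off, 1 = on).
SELECT name, value
FROM sys.database_scoped_configurations
WHERE name = 'LEGACY_CARDINALITY_ESTIMATION';
```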
Except for SQL Server 2008 R2, each version's compatibility level is 10 higher than the previous version's. So the compatibility level of SQL Server 2000 is 80, while that of SQL Server 2016 is 130. Each SQL Server version must support at least the compatibility levels of the previous two versions.
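As a sketch, switching a database to the SQL Server 2016 level looks like this (the database name MyDatabase is a placeholder):

```sql
-- Set the database to the SQL Server 2016 compatibility level (130).
ALTER DATABASE MyDatabase
SET COMPATIBILITY_LEVEL = 130;

-- Verify the change.
SELECT name, compatibility_level
FROM sys.databases
WHERE name = 'MyDatabase';
```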
Multi-Threaded Inserts
Before SQL Server 2016, an Insert-Select statement was multi-threaded only during the SELECT phase; the actual insert operations were performed serially. Now the insert operations can also be multi-threaded, or have a "parallel plan".
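A minimal sketch of a statement eligible for a parallel insert; in SQL Server 2016 this generally requires a TABLOCK hint on the target table (the table and column names here are hypothetical):

```sql
-- With compatibility level 130 and the TABLOCK hint, both the SELECT
-- phase and the INSERT phase of this statement can run in parallel.
INSERT INTO dbo.SalesArchive WITH (TABLOCK)
SELECT *
FROM dbo.Sales
WHERE SaleDate < '2015-01-01';
```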
Memory-Optimized Tables
Memory-optimized tables now also support multi-threading.
Statistics
There are two changes to statistics in SQL Server 2016. First, statistics are updated more frequently on large tables.
In the past, an automatic statistics update was triggered when 20% of a table's rows had changed, which is unsuitable for large tables. Starting with SQL Server 2016 (compatibility level 130), this threshold scales with the number of rows in the table: the more rows the table has, the lower the percentage of changed rows needed to trigger a statistics update. Note that in earlier versions this behavior was controlled by Trace Flag 2371.
For example, for a table with 1 billion rows, the old behavior required 200 million changed rows before an automatic statistics update would run. In SQL Server 2016, only 1 million changed rows are needed to trigger it.
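The new dynamic threshold is reportedly approximately SQRT(1000 × table rows), which reproduces the numbers above:

```sql
-- Old threshold: 20% of 1 billion rows = 200 million changed rows.
-- New threshold: SQRT(1000 * 1,000,000,000) = 1,000,000 changed rows.
SELECT 0.20 * 1000000000         AS OldThreshold,   -- 200,000,000
       SQRT(1000.0 * 1000000000) AS NewThreshold;   -- 1,000,000
```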
Continuing with parallelism: at compatibility level 130, statistics can now be sampled by a multi-threaded process.
Foreign Key Constraints
A major selling point of relational databases is the ability to relate one table to another and to ensure data consistency with foreign key constraints. But this has some overhead, so in SQL Server 2014 and earlier versions a table was limited to 253 foreign key constraints.
That number may seem large, but in a big database, once you start considering audit columns such as "CreatedByKey", it is easy to hit that limit. To alleviate this problem, Microsoft has raised the limit on incoming foreign key constraints to 10,000. That is, thousands of tables can reference your user table at the same time. But there are some caveats.
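For illustration, an incoming foreign key constraint from such an audit column might look like this (the table and column names are hypothetical):

```sql
-- Each of potentially thousands of tables can carry a constraint like
-- this one, all pointing back at the same Users table.
ALTER TABLE dbo.Orders
ADD CONSTRAINT FK_Orders_CreatedByKey
    FOREIGN KEY (CreatedByKey)
    REFERENCES dbo.Users (UserKey);
```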
First, this does not apply to outgoing foreign key constraints, nor to self-referencing tables. In those cases the limit remains 253 foreign keys.
Second, the referenced table cannot be modified using the MERGE operation; only DELETE and UPDATE operations are allowed. (Presumably SELECT and INSERT are also allowed, but the documentation does not mention them, because they are not affected by incoming foreign key constraints.)
Note: The term "relational" in "relational database management system" does not actually refer to relationships between tables. It is a mathematical term referring to the relationship between each value in a row and all the other values in the same row. In a pivot table, each cell is an independent sum or average value; that is an example of a non-relational table.
Joins and Foreign Key Constraints
As mentioned earlier, foreign key constraints have a cost. If you modify a row that may be referenced by a foreign key constraint, the database must check that the constraint is not violated.
In SQL Server 2014, every table referencing the table in question was joined in during this check. As you can imagine, the overhead quickly becomes very large. To solve this problem, SQL Server introduces a new "Referential Integrity Operator".
The new query operator performs the referential integrity check in place, comparing the modified rows against the rows in the referencing tables to verify that the modification will not break referential integrity. This greatly reduces compilation and execution times for such plans.