New design for cardinality estimation in SQL Server 2014

Last Update: 2015-07-16 Source: Internet

Author: User


For SQL Server databases, performance is a topic that never goes away. When we analyze and troubleshoot performance issues, the execution plan is one of the first things we look at.

We know that at compile time, SQL Server chooses what it judges to be the best execution plan for a statement, based on the statistics currently in the database and on the resources available on the machine at that moment.

So how does the query optimizer use these statistics? Based on the statistics in the database, it calculates the approximate number of rows returned by each operation. This step is called cardinality estimation (CE). Based on these estimates, the optimizer chooses logical and physical operators, computes operator costs, and so on, generates a set of candidate execution plans, and finally selects a suitable one.

In SQL Server 2014, cardinality estimation has changed significantly compared to previous versions, and these changes have a considerable effect on the execution plans that are generated. The new cardinality estimator is not a patch on top of the previous one that fixes a few bugs. It is better described as a rewrite, and even the underlying mathematical model has changed.

The new cardinality estimator is aimed mainly at data warehouse (DW) scenarios, where it brings a large performance improvement.

In terms of results, because of the changes to the mathematical model, the new cardinality estimator predicts the number of rows returned more accurately than before.

The following two examples compare the old and new cardinality estimators.

1. Independence assumption

The test statement is as follows:

SELECT *
FROM Cars
WHERE Make = 'Honda' AND Model = 'Civic'

Run the above statement in a test database in which the Cars table has 1000 rows, Make = 'Honda' matches 200 rows, and Model = 'Civic' matches 50 rows.

The previous CE assumes that the two filters are independent of each other, so the estimated row count is 0.05 * 0.2 * 1000 = 10. The new CE assumes that the two filters are correlated to some degree, so it applies an exponential backoff algorithm, and the estimated row count is 0.05 * sqrt(0.2) * 1000 = 22.36.

The actual number of rows returned is 50 rows.

So the new CE is more conservative, and in this case its estimate is closer to the actual row count.
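The arithmetic in this example can be reproduced with a short Python sketch. It covers only the two-predicate case described above. The function names are hypothetical, and this is not SQL Server's actual implementation.

```python
import math

def estimate_independent(sel_a, sel_b, table_rows):
    """Old-style estimate: treat the two filters as independent
    and multiply their selectivities."""
    return sel_a * sel_b * table_rows

def estimate_backoff(sel_a, sel_b, table_rows):
    """Two-predicate exponential backoff as described in the text:
    the more selective filter counts in full, the other one only
    under a square root."""
    most_selective = min(sel_a, sel_b)
    least_selective = max(sel_a, sel_b)
    return most_selective * math.sqrt(least_selective) * table_rows

table_rows = 1000
sel_make = 200 / table_rows   # Make = 'Honda' matches 200 rows
sel_model = 50 / table_rows   # Model = 'Civic' matches 50 rows

old_estimate = estimate_independent(sel_make, sel_model, table_rows)
new_estimate = estimate_backoff(sel_make, sel_model, table_rows)

print(round(old_estimate, 2), round(new_estimate, 2))  # 10.0 22.36
```

Both estimates are still below the actual 50 rows, but the backoff estimate is closer to it.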

2. Changes to join estimation

For an equijoin, the following calculation is used:

Take the smaller of the distinct value counts of the two inputs.
Multiply that value by the average frequency on each of the two sides.

For example
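The two steps above can be sketched in Python. The input sizes and distinct value counts below are hypothetical numbers chosen for illustration, and the function is a simplified reading of the description, not the optimizer's real code.

```python
def estimate_equijoin_rows(distinct_left, avg_freq_left,
                           distinct_right, avg_freq_right):
    """Simplified equijoin estimate following the two steps in the text."""
    # Step 1: take the smaller distinct value count of the two inputs.
    matching_values = min(distinct_left, distinct_right)
    # Step 2: multiply by the average frequency on both sides.
    return matching_values * avg_freq_left * avg_freq_right

# Hypothetical inputs:
#   left side:  1000 rows, 100 distinct join values -> average frequency 10
#   right side:  600 rows, 200 distinct join values -> average frequency 3
estimate = estimate_equijoin_rows(
    distinct_left=100, avg_freq_left=1000 / 100,
    distinct_right=200, avg_freq_right=600 / 200,
)
print(estimate)  # 3000.0
```

With these numbers, 100 join values are assumed to match, and each contributes 10 * 3 = 30 joined rows.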

The new cardinality estimator involves many other changes, such as changes for the ascending key scenario, changes to how statistics are used, and so on. However, some long-standing behavior remains the same: for example, table variables are still estimated as one row, local variables in stored procedures are still treated as unknown values, parameter sniffing problems can still occur, and so on.

Overall, however, the new cardinality estimator brings a considerable performance improvement for DW workloads, in both compilation time and execution time.

We mentioned statistics above. SQL Server 2014 also introduces a new statistics concept: incremental statistics.

In general, statistics record the data distribution, data density, and so on of a column or index. When automatic statistics updates are turned on, the statistics are updated automatically once about 20% of the data has changed.

In older versions of the database, statistics have two disadvantages: 1. For very large tables, the 20% threshold for automatic statistics updates is too high. 2. Rebuilding the statistics requires rescanning or resampling the entire table, when it would be better to scan only the new data.

To address these problems, SQL Server 2014 introduces a new feature, incremental statistics.

Incremental Statistics has the following features:

It applies to partitioned tables in which most data changes occur in new partitions.
Each partition has its own statistics object, and these per-partition statistics are merged into a global statistics object.
Since most data changes occur in new partitions, only the statistics of the changed partition need to be updated, and the system merges them with the existing statistics of the other partitions. This avoids rebuilding statistics for the other partitions.
The query optimizer uses the global statistics rather than the statistics of each individual partition.
When automatic statistics updates are turned on, the trigger threshold for each partition is a change to 20% of that partition's data. For the global statistics, it is 20% of the average partition size.
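To see why a per-partition threshold matters for very large tables, the Python sketch below compares a table-wide threshold with a per-partition one. The partition sizes are hypothetical, and the function uses only the approximate 20% figure given in the text, ignoring any other terms the real rule may contain.

```python
def auto_update_threshold(row_count, percent=20):
    """Rows that must change before an automatic statistics update,
    using the approximate 20% rule from the text (integer arithmetic)."""
    return row_count * percent // 100

# Hypothetical partitioned table: three large historical partitions
# and one small, newly loaded partition.
partition_rows = [50_000_000, 50_000_000, 50_000_000, 1_000_000]
table_rows = sum(partition_rows)

# Without incremental statistics, the threshold applies to the whole table.
whole_table_threshold = auto_update_threshold(table_rows)

# With incremental statistics, each partition has its own threshold.
newest_partition_threshold = auto_update_threshold(partition_rows[-1])

print(whole_table_threshold)       # 30200000
print(newest_partition_threshold)  # 200000
```

In this sketch, the newest partition qualifies for a statistics update after 200,000 changed rows, where the table-wide rule would wait for more than 30 million.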

Original link: http://blogs.msdn.com/b/apgcdsd/archive/2014/12/25/sql-2014-7-new-design-for-cardinality-estimation.aspx

