New design for cardinality estimation in SQL Server 2014

Source: Internet
Author: User

For SQL Server databases, performance is a topic that can never be avoided. And when we analyze and study performance problems, the execution plan is one of the first things we look at.

We know that at compile time SQL Server chooses what it considers the best execution plan for a statement, based on the statistics currently in the database and on the resources available on the machine at that moment.

So how does the query optimizer use these statistics? Based on the statistics in the database, it calculates the approximate number of rows returned by each operation. This is called cardinality estimation (CE). Using these estimates, the optimizer decides which logical and physical operators to use, costs each operation, generates a set of candidate execution plans, and finally selects a suitable one.

In SQL Server 2014, cardinality estimation has changed significantly compared with previous versions, and these changes have a considerable effect on the execution plans that are generated. The new cardinality estimator is not a patch on the previous version that fixes a few bugs; it can fairly be called a rewrite, and even the underlying mathematical model has changed.

The new cardinality estimator is aimed mainly at data warehouse (DW) scenarios, where it brings a large performance improvement.

In terms of results, because of the changes to the mathematical model, the new cardinality estimator predicts the number of rows returned more accurately than before.

The following two examples compare the old and new cardinality estimators.

1. Independence assumption

The test statement is as follows:

    SELECT *
    FROM Cars
    WHERE Make = 'Honda' AND Model = 'Civic'

Run the above statement in a test database where the table has 1000 rows, of which 200 have Make = 'Honda' and 50 have Model = 'Civic'.

The previous CE assumes that the two filters are unrelated (independent), so the predicted number of rows is 0.05 * 0.2 * 1000 = 10. The new CE assumes that there is some relationship between the two filters and applies an exponential backoff algorithm, so the predicted number of rows is 0.05 * sqrt(0.2) * 1000 = 22.36.

The actual number of rows returned is 50 rows.

So the new CE is more conservative, and in this case it is also more accurate.
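The two estimates can be reproduced with a few lines of Python. This is only a sketch of the arithmetic described in this example, not SQL Server's actual implementation. The function names are made up for illustration, and the way the backoff is extended past two predicates (exponents 1, 1/2, 1/4, ... applied from the most selective predicate onward) is an assumption that goes beyond the two-predicate formula given here.

```python
def independence_estimate(selectivities, table_rows):
    """Old-style estimate: multiply all selectivities together,
    i.e. treat the predicates as completely independent."""
    result = float(table_rows)
    for s in selectivities:
        result *= s
    return result


def backoff_estimate(selectivities, table_rows):
    """Exponential backoff: the most selective predicate counts in full,
    each further predicate is damped by a successive square root.
    (Extension beyond two predicates is an assumption of this sketch.)"""
    ordered = sorted(selectivities)  # smallest selectivity first
    result = float(table_rows)
    for i, s in enumerate(ordered):
        result *= s ** (1.0 / (2 ** i))
    return result


table_rows = 1000
sel_make = 200 / table_rows   # Make = 'Honda'  -> 0.2
sel_model = 50 / table_rows   # Model = 'Civic' -> 0.05

old = independence_estimate([sel_make, sel_model], table_rows)
new = backoff_estimate([sel_make, sel_model], table_rows)

print(f"independence assumption: {old:.2f}")  # 10.00
print(f"exponential backoff:     {new:.2f}")  # 22.36
```

Both numbers are below the actual 50 rows, but the backoff estimate is the closer of the two.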

2. Changes to join estimation

For an equijoin, the estimate is calculated as follows:

    • Take the smaller of the two inputs' distinct-value counts
    • Multiply that count by the average frequency on each side

For example
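As a minimal sketch with made-up numbers (they are not taken from the original article), the two steps of the equijoin calculation can be written in Python as follows. The function name and all input values are hypothetical, and "average frequency" is assumed here to mean the row count divided by the number of distinct values.

```python
def equijoin_estimate(left_rows, left_distinct, right_rows, right_distinct):
    """Estimate = min(distinct values) * avg frequency left * avg frequency right."""
    left_freq = left_rows / left_distinct      # average rows per distinct value, left input
    right_freq = right_rows / right_distinct   # average rows per distinct value, right input
    return min(left_distinct, right_distinct) * left_freq * right_freq


# Hypothetical inputs: 1000 rows with 100 distinct join keys on the left,
# 5000 rows with 250 distinct join keys on the right.
estimate = equijoin_estimate(1000, 100, 5000, 250)
print(estimate)  # 20000.0
```

Here the smaller distinct count is 100, and the average frequencies are 10 and 20, giving 100 * 10 * 20 = 20000 estimated rows.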

The new cardinality estimator involves many other changes, such as changes to the handling of ascending keys and to the way statistics are used. However, some traditional behavior remains the same: for example, table variables are still estimated as one row, local variables in stored procedures are still treated as unknown values, and parameter sniffing problems can still occur.

Overall, however, the new cardinality estimator brings a considerable performance boost to DW workloads, in both compilation time and execution time.

The preceding text referred to statistics. SQL Server 2014 also introduces a new statistics concept: incremental statistics.

In general, statistics record the data distribution, data density, and so on of a column or index. When automatic statistics updates are turned on, the statistics are updated automatically once about 20% of the data has changed.

In older versions of the database, statistics have two disadvantages:

    1. For very large tables, the 20% automatic update threshold is too high.
    2. Rebuilding the statistics requires rescanning or resampling the entire table; it would be better to scan only the new data.
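To get a feel for why a 20% threshold is a problem on very large tables, the short Python sketch below prints how many row changes are needed before an automatic update is triggered. It uses only the approximate 20% rule mentioned above; the function name is made up, and the real trigger condition in SQL Server has more detail than this.

```python
def changes_before_auto_update(table_rows, percent=20):
    """Row changes needed to reach the (approximate) auto-update threshold."""
    return table_rows * percent // 100  # integer arithmetic, no rounding surprises


for rows in (10_000, 1_000_000, 1_000_000_000):
    needed = changes_before_auto_update(rows)
    print(f"{rows:>13,} rows -> {needed:>11,} changes before statistics refresh")
```

For a table with one billion rows, 200 million rows have to change before the statistics are refreshed, so newly loaded data can go unrepresented in the statistics for a long time.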

To address this, SQL Server 2014 introduces a new feature, incremental statistics.

Incremental statistics have the following characteristics:

    1. They apply to partitioned tables where most data changes occur in new partitions.
    2. Each partition has its own statistics object, and these per-partition statistics are merged into the global statistics.
    3. Since most data changes occur in new partitions, only the statistics of the changed partition need to be updated; the system then merges them with the statistics of the other partitions. This avoids rebuilding the statistics of the other partitions.
    4. The query optimizer uses the global statistics rather than the statistics of each partition.
    5. When automatic statistics updates are turned on, the trigger threshold for each partition is a 20% change in that partition's data; for the global statistics, it is 20% of the average partition size.
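The per-partition behavior can be illustrated with a toy Python model. This is only a sketch of the idea, not how SQL Server stores or merges statistics: the `PartitionStats` class, the function name, and all row counts are hypothetical, and only the per-partition 20% rule is modeled.

```python
from dataclasses import dataclass


@dataclass
class PartitionStats:
    rows: int          # rows in the partition
    modified: int = 0  # rows changed since the partition's statistics were last built


def partitions_to_refresh(partitions, percent=20):
    """Indexes of partitions whose changes reached `percent` of their own row count."""
    return [
        i for i, p in enumerate(partitions)
        if p.rows > 0 and p.modified * 100 >= p.rows * percent
    ]


# Hypothetical table: two large, unchanged historical partitions
# and one small, recently loaded partition.
partitions = [
    PartitionStats(rows=1_000_000),
    PartitionStats(rows=1_000_000),
    PartitionStats(rows=100_000, modified=30_000),
]

stale = partitions_to_refresh(partitions)

# For comparison: a single table-wide 20% threshold.
total_rows = sum(p.rows for p in partitions)
total_modified = sum(p.modified for p in partitions)
whole_table_stale = total_modified * 100 >= total_rows * 20

print("partitions to refresh:", stale)                      # [2]
print("table-wide threshold reached:", whole_table_stale)   # False
```

In this model only the newest partition is rescanned, while a table-wide threshold would not have triggered an update at all: 30,000 changes are far below 20% of 2,100,000 rows.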

Original link: http://blogs.msdn.com/b/apgcdsd/archive/2014/12/25/sql-2014-7-new-design-for-cardinality-estimation.aspx
