Impact of statistical information on the Execution Plan (I)

Source: Internet
Author: User
Impact of statistical information on execution plans (1) We know that statistical information directly determines the execution plans produced by the relational engine, this article demonstrates two examples: 1. The impact of statistical information on the connection method 2. The impact of statistical information on the way data is obtained from a single table. The following is a training I have conducted. content, the test environment is SQLServer2008.

Impact of statistical information on execution plans (1) We know that statistical information directly determines the execution plans produced by the relational engine, this article demonstrates two examples: 1. The impact of statistical information on the connection method 2. The impact of statistical information on the way data is obtained from a single table. The following is a training I have conducted. content, the test environment is SQLServer 2008

Impact of statistical information on the Execution Plan (I)

We know that statistical information directly determines the execution plan generated by the relational engine. This article demonstrates two examples.

1. Impact of statistical information on connection modes

2. Impact of statistical information on data acquisition methods in a single table

The following is a training I have conducted. The test environment is SQLServer 2008 R2. If you do not understand what the statistical information is, click here

First, let's take a look at the impact of statistics on connection methods,

First, create a Test DB, set "auto create statistics" and "auto update statistics" to OFF, and then run the code to create related tables and write records.

Create table CHECK2_T1 (id int, C1 CHAR (800) create table CHECK2_T2 (id int, C1 CHAR (8000) CHECK2_T1 CHECK2_T2

Then execute the following two SQL statements. The SQL statement in ② uses the table prompt HASH to simulate the statistical information. The statistical information will be displayed later.

The following two SQL statements show that the result set is 0 based on the query conditions. Let's look at their execution plan.

① CHECK2_T2 B ②

The execution plan is as follows:

Comparing the preceding two execution plans, we found that the two table connections in Table 1 use the "merge join", the US space, and CHECK2_T2 are scanned by the entire table, now you should understand the meaning of simulating the CHECK2_T2 table into a record with an 8 K page. In fact, this table alone occupies 50 million pages and approximately 4 GB of disk space, it is slow to scan tables.

After reading the execution plan of ①, we know that optimization is to remove this "merge connection" and change it to "nested loop" or "hash connection" to achieve our results, as a result, in SQL, we force the "hash matching" request through the connection prompt. We can see that SQL is instantaneous and achieves our goal, of course, we can also change the "HASH" prompt to the "LOOP" prompt to let the query go through the "nested LOOP" method, and its effect is instantaneous. The query plan is as follows:

Now we can see that the desired execution plan is achieved by adding a connection prompt, but this is not our suggestion. The Hong Kong virtual host, we can see that the table scan icon in the execution plan has a yellow triangle exclamation mark, the U.S. space, move the mouse up, it prompts us that the table lacks XX statistics, for example, on CHECK2_T1, the system prompts that the C1 column lacks statistics, which directly leads to a bad execution plan for sqlserver. We manually create the WX_C1 statistical information on the C1 column of the CHECK2_T1 table,

After the creation is complete, execute the SQL statement ① again, and find that the default execution plan begins to use "hash matching". The query is also completed instantly. The execution plan is as follows:

So far, we have shown the impact of statistical information on table join in the execution plan. The next article will show the impact of statistical information on the single-Table query plan.

It should be noted that, in fact, I cannot clearly explain how parsing statistical information leads to different query plans. This is something inside the sqlserver engine and there are relatively few materials, however, you should understand the importance of statistical information. As Microsoft's official documentation says, expired statistical information may seriously affect system performance. It is best to keep the default database "auto create statistics and auto update statistics" open, unless you have enough reason not to use the auto create and auto update statistics function.

Posted on

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.