SQL Server Performance Optimization Analysis of SQL Execution Process

Source: Internet
Author: User

The topics in this article are as follows:

  • Query plan Overview
  • Query resolution
  • Query Optimizer
  • Query execution
  • Reuse of query plans
Query plan Overview

Most of the time, when we are using SQL server, it is very simple to do: enter an SQL statement, execute it, and finally obtain the result. Next, in order to make everyone better understand the internal mechanism of SQL server, we will re-examine the execution of an SQL statement.

After the SQL statement is mentioned, the database executes a series of internal processing. We can roughly divide the internal processing into two stages according to the execution sequence:

In the relational engine of the database, SQL query statements are parsed and the parsed results are passed to the query optimizer. The query optimizer is responsible for generating the execution plan. Then, the execution plan (in the second-level format) will be passed to the storage engine, and finally the underlying data will be returned or updated.

The database storage engine performs many operations, such as locking, index maintenance, and transaction processing.

Because this series of articles mainly analyze execution plans, our focus will be on the relational engine.

Next, we will discuss in detail the execution process of an SQL query statement.

Query resolution

As we have just mentioned: After an SQL statement is submitted to the database, the SQL statement is first passed into the relational engine.

When an SQL statement reaches the relational engine, the first operation is to check whether the SQL statement format is correct. This process is what we often call "resolution. The result of the parsing process is to generate a parsing tree, or a query tree. The query tree reflects the logical steps to perform a query. The structure of the query tree is similar to the following figure:

In fact, from the perspective of the compilation principle, This parsing process is the parsing of grammar and lexical, and finally the syntax tree is generated.

Note that if the submitted SQL statement is not a data operation statement (the data operation statement refers to the Select, Insert, and Update statements), the statement will not be optimized. For example, if the SQL statement submitted is to create a data table, the statement will not be optimized, but will be executed directly.

If the data operation statement is submitted, the parsing tree created by the relational engine will be passed to the algebrizer component for binding. During the binding process, the algebrizer component checks the table name in the resolution tree and whether the column name is associated with the reference of the corresponding table or object in the database.

The algebrizer component is also responsible for determining whether the types of each node in the resolution tree are consistent with those in the database. The algebrizer component traverses the tree from bottom to top, that is, starting from the page-level node, that is, the column and constant.

Binding and parsing is a very important process. In this process, we will also identify some aliases we have defined. After this process is completed, a binary "query processing Tree" is generated, which is passed to the query optimizer.

Query Optimizer

The query optimizer uses the Query Processing tree and related statistics to generate an execution plan.

In other words, the query optimizer points out how best to execute the submitted SQL statement. The query optimizer determines whether indexes can be used to access data. It is better to use that type of join operation (for example, although we sometimes write Left Join in SQL, the Query Optimizer may use Inner Join after analysis to ensure the same results ).

The query optimizer is an optimizer for basic cost analysis. This means that it will try to generate the lowest cost execution plan for each SQL statement.

In addition, we will briefly analyze the statistics used by the optimizer. The so-called statistical data refers to the data used to describe the column and index-related information in the database, that is, the data or "metadata ". The optimizer estimates the cost by combining the statistical data and Query Processing tree.

By default, statistics are automatically updated in the database (Manual updates can be performed during optimization ).

It should be mentioned that the table variables do not have any statistical data. That is to say, if the data in the table variables are queried, the optimizer will not make any optimization. However, temporary tables have corresponding statistics.

One thing to note is that the above cost is only "estimation. Some complex statements may have many execution plans as candidates. In this case, the query optimizer does not analyze all the combinations, but finds an execution plan close to the theoretical minimum. The planned cost is estimated the time required to complete the query. The lowest estimated cost is not necessarily the lowest resource cost.

Query execution

Once an execution plan is generated, the operation is transferred to the storage engine, which is also the place where the query is actually executed and the production of the actual execution plan generated based on estimation of the execution plan.

Reuse of query plans

From the previous steps, we can see that SQL Server requires a lot of steps and a lot of costs to generate an actual execution plan (excessive compilation of execution plans is often a big performance problem ), the execution plan must be reused as much as possible (if no special description is provided later, the execution plan refers to the actual execution plan). Therefore, once the execution plan is generated in the database, it is cached in the memory (called the plan buffer ).

As mentioned earlier, when the optimizer generates an estimated execution plan, the plan will be passed to the storage engine. In fact, before passing the estimated execution plan to the storage engine, the query optimizer will go to the "Plan buffer" to find the actual execution plan corresponding to the current estimated execution plan. If it is found, the query optimizer uses the execution plan for subsequent operations. This avoids re-generating the actual execution plan.

Generally, only one execution plan for each query is saved, unless the query optimizer knows that parallel execution can produce better performance, the execution plan for parallel query is cached, that is to say, the same query has two execution plans in the plan buffer.

The execution plan is not always stored in the memory. They will also expire. SQL Server removes execution plans that are not commonly used based on the least recently used algorithms. The following table lists the conditions for removing an execution plan:

Note: If you are familiar with. NET, you can compare it with. NET's garbage collection mechanism.

This is the next article. We will analyze the execution plan more!

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.