SQL Server Full Optimization-------Writing good statements is a habit

Source: Internet
Author: User
Tags: sorting

The previous articles gave the basic approach to diagnosing a database from the system level down. Maybe that was very useful to you, or maybe it felt far removed from your daily work. So today let's start from the other end: statement-level optimization and some simple optimization methods. This is useful for many developers! For convenience, here is the link to the earlier article:

SQL Server fully optimized-------Expert for SQL Server Diagnostic series

There is a lot of material on the internet: many, many articles on SQL statement optimization, optimization tips, lists of 100 optimization notes, and plenty of compiled collections. So why should I join the fun and write another one? Well, I don't know either!

    

--------------Blog Address---------------------------------------------------------------------------------------

Expert Diagnostic Optimization Series http://www.cnblogs.com/double-K/

Enough talk, let's get started-----------------------------------------------------------------------------------------

    • Key point---statement execution order

While chatting in a QQ group, one member suddenly said: I know whether a statement uses an index is determined by the fields in the SELECT list! It sounded plausible, and he was very sure! Another member asked: then what about UPDATE? It looks like a newbie question, but it really shocked me! So let's take a look at the execution order of a statement.

If I remember correctly, this is the first section of the first chapter of "Inside Microsoft SQL Server 2005: T-SQL Querying". The book's author also wants readers to first understand the order in which a statement is executed, because if you don't know the order, how can you write good statements?

Logical execution order of queries:

(1) FROM <left_table>
(3) <join_type> JOIN <right_table>
(2)   ON <join_condition>
(4) WHERE <where_condition>
(5) GROUP BY <group_by_list>
(6) WITH {CUBE | ROLLUP}
(7) HAVING <having_condition>
(8) SELECT (9) DISTINCT (11) TOP <top_specification> <select_list>
(10) ORDER BY <order_by_list>

The parsing order of standard SQL is:

(1). The FROM clause assembles data from different data sources

(2). The WHERE clause filters records based on the specified criteria

(3). The GROUP BY clause divides the data into multiple groupings

(4). Using aggregate functions for calculations

(5). Use the HAVING clause to filter the grouping

(6). Calculate all expressions

(7). Use order by to sort the result set

Execution order:

1. FROM: Perform a Cartesian product of the first two tables in the FROM clause, generating virtual table VT1.

2. ON: Apply the ON filter to VT1; only rows that satisfy <join_condition> are inserted into VT2.

3. OUTER (JOIN): Add rows from the preserved table that found no match as outer rows to VT2, generating VT3. If FROM contains more than two tables, repeat steps 1-3 with the result table of the previous join and the next table, until the end.

4. WHERE: Apply the WHERE filter to VT3; only rows for which <where_condition> is TRUE are inserted into VT4.

5. GROUP BY: Group the rows in VT4 by the column list in the GROUP BY clause, generating VT5.

6. CUBE | ROLLUP: Insert supergroups into VT5, generating VT6.

7. HAVING: Apply the HAVING filter to VT6; only groups for which <having_condition> is TRUE are inserted into VT7.

8. SELECT: Process the SELECT list, generating VT8.

9. DISTINCT: Remove duplicate rows from VT8, generating VT9.

10. ORDER BY: Sort the rows of VT9 by the column list in the ORDER BY clause, generating cursor VC10.

11. TOP: Select the specified number or percentage of rows from the beginning of VC10, generating table VT11, and return it to the caller.
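To make the order tangible, here is a small sketch in Python with the built-in sqlite3 module (an illustration of standard SQL's logical order, not SQL Server itself; the table and data are made up): an alias created by SELECT (step 8) is visible to ORDER BY (step 10), while an aggregate in WHERE (step 4) fails because grouping has not happened yet.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INT)")
conn.executemany("INSERT INTO t VALUES (?)", [(3,), (1,), (2,), (1,)])

# ORDER BY (step 10) runs after SELECT (step 8), so it can see the alias "d".
rows = conn.execute("SELECT a * 2 AS d FROM t ORDER BY d").fetchall()
print(rows)  # [(2,), (2,), (4,), (6,)]

# WHERE (step 4) runs before GROUP BY and SELECT, so an aggregate there fails.
err = None
try:
    conn.execute("SELECT a FROM t WHERE COUNT(*) > 1 GROUP BY a")
except sqlite3.OperationalError as e:
    err = e
print(err)  # "misuse of aggregate" error
```

The same condition moved into HAVING (step 7, after grouping) would be legal — exactly the point of knowing the order.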

  Now we understand SQL Server's execution order. If you didn't know it before, please experiment and memorize it until it sticks! Then we can further build good daily SQL habits, that is, thinking about performance while implementing functionality!

    • Design ideas

Don't worry about the specific statement-level tweaks yet; compared with design problems they are child's play!

"Design ideas" may sound a bit grand, so below are a few of the most common design problems!

  

Replace loops with batches

  If a loop performs single-row operations, change it to a batch operation; if there seems to be no way to change it, try anyway! This is the most common problem:

    1. On the application side, a for loop that (even more nauseating) opens and closes a connection on every iteration: a job that should run in a few minutes takes hours when the row count is large. Please collect the results of each loop iteration into a DataTable, a List or similar, and write them to the database in one batch, instead of one round trip per row!
    2. Cursors in the database are much the same. If it is at all possible to avoid processing rows one at a time with a cursor, please avoid it. If you think you have no choice, ask someone whether it can be done another way!
    3. If single-row writes really cannot be avoided, then explicitly opening a transaction (BEGIN TRAN) before processing starts and committing after it finishes is much faster than not opening an explicit transaction!

A small example:

-- Without an explicit transaction (each INSERT auto-commits)
CREATE TABLE test_0607 (a INT, b NVARCHAR(100))  -- column length assumed; garbled in the original
DECLARE @i INT
SET @i = 1
WHILE @i < 10000
BEGIN
    INSERT INTO test_0607 SELECT @i, '0607 no explicit transaction'
    SET @i = @i + 1
END

DROP TABLE test_0607
CREATE TABLE test_0607 (a INT, b NVARCHAR(100))
-- With an explicit transaction
BEGIN TRAN
DECLARE @i INT
SET @i = 1
WHILE @i < 10000
BEGIN
    INSERT INTO test_0607 SELECT @i, '0607 explicit transaction'
    SET @i = @i + 1
END
COMMIT  -- end the transaction

Result: 8 seconds versus 0.8 seconds; the difference speaks for itself! But there are trade-offs: this only shows that wrapping the whole process in one big transaction helps when the process is not particularly long. If the work inside the transaction is large and long-running, it becomes a disaster!
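The same per-row-commit versus one-transaction effect can be sketched outside SQL Server with Python's built-in sqlite3 module (an analogue of the principle, not SQL Server itself; the table name is reused from the example above):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_0607 (a INTEGER, b TEXT)")

# Row-at-a-time, committing after every insert (the slow pattern).
t0 = time.perf_counter()
for i in range(10000):
    conn.execute("INSERT INTO test_0607 VALUES (?, ?)",
                 (i, "no explicit transaction"))
    conn.commit()
per_row = time.perf_counter() - t0

conn.execute("DELETE FROM test_0607")
conn.commit()

# One explicit transaction around the whole batch.
t0 = time.perf_counter()
with conn:  # opens a transaction, commits once at the end
    conn.executemany("INSERT INTO test_0607 VALUES (?, ?)",
                     ((i, "one transaction") for i in range(10000)))
batched = time.perf_counter() - t0

count = conn.execute("SELECT COUNT(*) FROM test_0607").fetchone()[0]
print(count)              # 10000
print(per_row, batched)   # the batched timing is typically far smaller
```

On a disk-backed database the gap is even larger, because every commit forces a sync.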

  

  Reduce statement complexity

  As described in the earlier statement-tuning article, reducing statement complexity is a common optimization method. In practice there are two common reasons a statement becomes particularly complex:

    1. The program logic itself is complex: many tables must be joined, with sorting and aggregation on top, a few subqueries thrown in from time to time, plus a few functions.
    2. The business has a lot of common logic, so many views are created, views are even nested inside views several layers deep, and the outermost layer then joins in a module-specific table.

In the first case the code looks long and complex; code like that may look impressive, but to an expert it actually looks crude. In the second case the code looks very concise, but after the optimizer recompiles it, it is really no different from the first. The solution to both is to reduce complexity: split out whatever can be split into temporary tables or table variables. For example, join the tables with highly selective conditions first, put that result into a temporary table, and then join the temporary table with the other tables. Think of it this way: if I have 10 tables to join, I first join 5 of them, put the result into a temporary table, and then join it with the remaining 5 tables. The query goes from one 10-table join to a 5-table join plus a 6-table join, which greatly reduces the complexity of the statement.

The same is true for complex views: materialize the view's result into a temporary table before the outer join, then join the temporary table with the outer layer.

The same is true of subqueries: where possible, separate them out into temporary tables first.
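The splitting idea can be sketched like this, again using sqlite3 as a stand-in (the tables and names here are invented for illustration): join the highly selective part first into a temporary table, then join the small intermediate result onward.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders  (order_id INTEGER, cust_id INTEGER, amount REAL);
CREATE TABLE custs   (cust_id INTEGER, region TEXT);
CREATE TABLE regions (region TEXT, manager TEXT);
INSERT INTO orders  VALUES (1, 1, 10.0), (2, 2, 20.0), (3, 1, 30.0);
INSERT INTO custs   VALUES (1, 'east'), (2, 'west');
INSERT INTO regions VALUES ('east', 'alice'), ('west', 'bob');
""")

# Step 1: join the highly selective part first; the result is small.
conn.execute("""
CREATE TEMP TABLE east_orders AS
SELECT o.order_id, o.amount, c.region
FROM orders o JOIN custs c ON o.cust_id = c.cust_id
WHERE c.region = 'east'
""")

# Step 2: join the small intermediate result with the remaining tables.
rows = conn.execute("""
SELECT e.order_id, e.amount, r.manager
FROM east_orders e JOIN regions r ON e.region = r.region
ORDER BY e.order_id
""").fetchall()
print(rows)  # [(1, 10.0, 'alice'), (3, 30.0, 'alice')]
```

A three-table join split into two two-table joins; with ten tables the reduction in optimizer search space is far more dramatic.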

For table-valued functions, there is also a distinction between inline and multi-statement:

  

-- Mode 1: inline table-valued function
CREATE FUNCTION [dbo].[Tvf_inline_test] ()
RETURNS TABLE
AS
RETURN
    SELECT sod.ProductID
    FROM Sales.SalesOrderHeader soh
        INNER JOIN Sales.SalesOrderDetail sod ON soh.SalesOrderID = sod.SalesOrderID
-- This form can be recompiled together with the outer query
-- (that is, it can take advantage of the outer context and WHERE conditions).

-- Mode 2: multi-statement table-valued function
CREATE FUNCTION [dbo].[Tvf_multi_test] ()
RETURNS @SaleDetail TABLE (ProductID INT)
AS
BEGIN
    INSERT INTO @SaleDetail
    SELECT sod.ProductID
    FROM Sales.SalesOrderHeader soh
        INNER JOIN Sales.SalesOrderDetail sod ON soh.SalesOrderID = sod.SalesOrderID
    RETURN
END
-- This form cannot have the outer filter pushed into it;
-- with large data volumes that hurts performance.

  High-energy warning: use temporary tables appropriately! I have met many developers who go through a progression: they start with one huge, complex statement; then they learn about temporary tables and start using one for every small step. That puts a lot of pressure on your tempdb!

For details, see: Expert Diagnostic Optimization Series------------------to cool tempdb

  Avoid repeated reads

  I have seen many systems like this: a product has many kinds of analysis, each doing somewhat different processing, but all of them read the same base data: products and product details. Many teams make each analysis a separate stored procedure, so 20 analyses mean 20 stored procedures, and the first step of every one is to read the base data first (products and details, and so on). Unfortunately, the product and product-detail tables are huge. Even though they are split into monthly tables of about 20 million rows each, every stored procedure reads a full year of data, roughly 20 million * 12 rows, into a temporary table. The 20 stored procedures run in sequence, so this enormous base-data query runs 20 times every night! This processing took up the entire nightly maintenance window, and sometimes it could not even finish in time and affected normal daytime business!

Maybe you laughed reading that description: who would design it like this? Is this a joke? Yes, and the fix is super simple: merge the 20 stored procedures into one. Query the base data only once, put it into a temporary table, create the indexes the subsequent logic needs, and do all the subsequent processing against that temporary table. A job that used to need more than 6 hours a night was shortened to 40 minutes! (Of course that is a little exaggerated; there were some other optimizations too.)

One important point about temporary table usage is worth mentioning here: when writing a large amount of data into a temporary table, be sure to CREATE the table first and then INSERT into it; do not use SELECT ... INTO the temporary table directly, or it will be a disaster!
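A minimal sketch of the read-once pattern, using sqlite3 as a stand-in (the tables are invented; note the CREATE-then-INSERT shape recommended above): the base data is loaded into a temporary table once, indexed, and then reused by each analysis.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE details (product_id INTEGER, month INTEGER, qty INTEGER);
INSERT INTO details VALUES (1, 1, 5), (1, 2, 7), (2, 1, 3), (2, 3, 9);
""")

# Read the base data ONCE into a temp table (CREATE first, then INSERT),
# and index it for the processing that follows.
conn.execute("CREATE TEMP TABLE base "
             "(product_id INTEGER, month INTEGER, qty INTEGER)")
conn.execute("INSERT INTO base SELECT product_id, month, qty FROM details")
conn.execute("CREATE INDEX idx_base_pid ON base (product_id)")

# Analysis 1: total quantity per product -- reads the temp table, not details.
totals = dict(conn.execute(
    "SELECT product_id, SUM(qty) FROM base GROUP BY product_id").fetchall())

# Analysis 2: active months per product -- the same temp table again.
months = dict(conn.execute(
    "SELECT product_id, COUNT(DISTINCT month) FROM base GROUP BY product_id"
).fetchall())

print(totals, months)  # {1: 12, 2: 12} {1: 2, 2: 2}
```

Each additional analysis costs only a read of the small temp table, not another scan of the huge base table.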

    • On the importance of indexes

    It's a cliché; I think every company asks questions like these in interviews: What is an index? What kinds of indexes are there, and how do they differ? And so on...

What is an index? What is a clustered index? What is a non-clustered index? What is a key lookup? What is a clustered index scan? What is an index seek? What is a bookmark lookup? What are the differences? They are not covered here; please Baidu them yourself!

Many developers do not realize how important indexes are to a statement, or even to the whole system. Follow-up articles will discuss the importance of indexes to your system.

    How to build an index

The simplest and most brute-force way: when you have finished writing a statement, open the actual execution plan and create the index the optimizer's missing-index hint suggests, see:


Expert Diagnostic Optimization Series------------------statement tuning kick

    

    High-energy WARNING: this requires your conditions to be able to use an index (to be sargable)! For example, the indexed column in your statement must not have a function applied to it, must not participate in a calculation such as WHERE ProductID/2 = @a, must not undergo implicit conversion, and so on!
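Sargability is easy to observe in any engine's plan output. Here is a sqlite3 sketch (the plan text is SQLite's, not SQL Server's, and the table is invented): comparing the bare indexed column gives an index search, while putting the same column inside a calculation forces a full scan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (product_id INTEGER, name TEXT)")
conn.execute("CREATE INDEX idx_product_id ON product (product_id)")

def plan(sql):
    # Concatenate the plan detail strings for a query.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Sargable: the bare column is compared, so the index can be used.
good = plan("SELECT * FROM product WHERE product_id = 10 / 2")
# Not sargable: the column participates in a calculation, forcing a scan.
bad = plan("SELECT * FROM product WHERE product_id / 2 = 5")

print(good)  # contains "USING INDEX"
print(bad)   # a full table scan ("SCAN")
```

Both predicates are logically equivalent; only the second one hides the column from the index.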


Once indexes are built, not every query will use them, and even when an index is used, its efficiency can differ greatly. As for the missing index above, we will add it and examine the query later!

    An index seek is generally optimal (though there are also seek-plus-filter plans). Try to make the fields in the WHERE condition a composite index, and include the fields the SELECT needs so the index covers the query. We won't go further into this here.

Understand execution plans

How to read an execution plan is a topic that could fill a book of hundreds of pages, but understanding execution plans is the top priority of optimization! Future articles will explain this in more detail.

Through the execution plan you can see where a statement's main cost lies; together with SET STATISTICS IO ON and similar ways of analyzing read counts, it is the key to optimization, and creating or tuning indexes also mainly starts from here.

    • General statement habits

  

  Return only the data you need

Returning data to the client requires at least: the database extracting the data, the network transmitting it, the client receiving it, and the client processing it. Returning unnecessary data adds useless work for the server, the network and the client, and the harm is obvious. To avoid it, note the following:

Column-wise:

    1. Do not write SELECT * statements; select only the fields you need.
    2. When joining multiple tables in one SQL statement, use table aliases and prefix every column with its alias. This reduces parsing time and avoids the syntax errors caused by ambiguous columns. See also: Careful is important---guess what this SQL execution means

Row-wise:

    1. The WHERE condition should filter as much as possible and ensure high selectivity.
    2. It is common for the business to return large result sets to the front end, but is all that data really necessary? Can the front end add some default conditions?
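A tiny sqlite3 sketch of the column-wise point (the table is invented): SELECT * drags every column across the wire, including a wide description, even when only two columns are needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER, name TEXT, descr TEXT)")
# The description column is deliberately wide.
conn.execute("INSERT INTO product VALUES (?, ?, ?)", (1, "widget", "x" * 1000))

star = conn.execute("SELECT * FROM product").fetchone()
narrow = conn.execute("SELECT id, name FROM product").fetchone()

print(len(star), len(narrow))  # 3 2
# The SELECT * row carries the 1000-character description whether or not
# the client ever looks at it.
```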

  Reduce unnecessary operations

Before you write your statement, get your thinking clear!

    1. Eliminate unnecessary table joins; every extra join brings extra overhead.
    2. Reduce unnecessary condition checks. Very often, to handle possibly-empty values passed in from the front end, the back-end statement is written as xx = xx OR xx IS NULL OR xx LIKE ... OR ... OR ... and so on. This is a classic problem; please check for the empty value up front instead of inside the conditions!
    3. Does your statement really need deduplication? DISTINCT, UNION and similar operations are costly.
    4. Understand the difference between LEFT JOIN and INNER JOIN and whether you really need the outer join; otherwise choose INNER JOIN to reduce unnecessary rows returned.
    5. ORDER BY: does your statement need to sort? Can the sort's performance cost be covered by an index? I have even seen data inserted with an ORDER BY!
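A sqlite3 sketch of point 4 (the tables are invented): LEFT JOIN keeps unmatched rows as NULL-padded outer rows, so if you do not actually need them, INNER JOIN returns less data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, cust_id INTEGER);
CREATE TABLE custs  (cust_id INTEGER, name TEXT);
INSERT INTO orders VALUES (1, 1), (2, 9);   -- cust 9 has no row in custs
INSERT INTO custs  VALUES (1, 'alice');
""")

inner = conn.execute("""
    SELECT o.order_id, c.name FROM orders o
    JOIN custs c ON o.cust_id = c.cust_id
    ORDER BY o.order_id""").fetchall()

left = conn.execute("""
    SELECT o.order_id, c.name FROM orders o
    LEFT JOIN custs c ON o.cust_id = c.cust_id
    ORDER BY o.order_id""").fetchall()

print(inner)  # [(1, 'alice')]
print(left)   # [(1, 'alice'), (2, None)]
```

If the NULL-padded row for order 2 is never used downstream, the LEFT JOIN only produced extra rows to transfer and filter later.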

  

  Filter as early as possible

    1. The classic example is the difference between WHERE and HAVING; you have already seen the statement execution order above. If a condition can be written in the WHERE clause, do not put it in HAVING.
    2. When using temporary tables to reduce statement complexity, also reduce the amount of data entering the temporary table; that is, join the filtering conditions in when populating the temporary table.
    3. The implicit conversions mentioned earlier, and calculations or functions applied to indexed fields, likewise prevent data from being filtered as early as possible.
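A sqlite3 sketch of point 1 (the table is invented): a non-aggregate condition placed in WHERE drops rows before grouping, while the HAVING version builds every group first and filters only afterwards. The results match, but WHERE filters earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 10), ("east", 20), ("west", 5)])

# Filter in WHERE: non-matching rows are discarded BEFORE grouping.
early = conn.execute(
    "SELECT region, SUM(amount) FROM sales "
    "WHERE region = 'east' GROUP BY region").fetchall()

# Same result, but every group is built first and filtered afterwards.
late = conn.execute(
    "SELECT region, SUM(amount) FROM sales "
    "GROUP BY region HAVING region = 'east'").fetchall()

print(early == late, early)  # True [('east', 30)]
```

HAVING should be reserved for conditions on aggregates (e.g. SUM(amount) > 100), which cannot exist before grouping.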

  Common mistakes in writing (the following are one-sided conclusions often seen online)

None of the blanket rules below is universally valid:

    1. "Replace OR with UNION ALL" (OR is a perfectly normal construct and covers many cases: two conditions on one column of one table, a.a = x OR a.a = xx; two columns of one table, a.a = x OR a.b = x; columns of two different tables, a.a = x OR b.a = x. What the internet calls "replacing OR with UNION ALL" fits only some of these.)
    2. "Avoid IN and NOT IN" (with small data volumes there is no problem; with large volumes it may affect performance, and the way to handle a large list is to put the IN data into a temporary table first.)
    3. "Keep transaction operations as small as possible; split what can be split." (As the earlier example showed, in some cases, such as single-row writes in a loop, explicitly opening one large transaction actually helps.)
    4. "Query statements using WITH (NOLOCK) are never blocked." (Usually true, but a WITH (NOLOCK) query can still be blocked, for example by schema modifications or snapshot publications.)
    5. "Replace IN with EXISTS." (This situation is also too complex to generalize.)
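A sqlite3 sketch of the temporary-table alternative mentioned in point 2 (the tables are invented, and the IN-list here is tiny only to keep the example short): the IN-list and the temp-table join return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, cust_id INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 1), (2, 2), (3, 3), (4, 1)])

wanted = [1, 3]  # in real systems this list can be very large

# Direct IN-list.
via_in = conn.execute(
    "SELECT order_id FROM orders WHERE cust_id IN (?, ?) ORDER BY order_id",
    wanted).fetchall()

# Same filter via a temp-table join -- the shape recommended for large lists.
conn.execute("CREATE TEMP TABLE wanted_custs (cust_id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO wanted_custs VALUES (?)",
                 [(c,) for c in wanted])
via_join = conn.execute("""
    SELECT o.order_id FROM orders o
    JOIN wanted_custs w ON o.cust_id = w.cust_id
    ORDER BY o.order_id""").fetchall()

print(via_in == via_join, via_in)  # True [(1,), (3,), (4,)]
```

With a huge list, the join against an indexed temp table also gives the optimizer statistics to plan with, which a long literal IN-list does not.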


-----------------------------------------------------------------------------------------------------

Summary: Having written this far about statement optimization, there is still far too much to cover. All of it requires understanding the principles, being able to read execution plans, and accumulating experience continuously.

Reading a few optimization notes alone helps very little; you must also practice hands-on and understand why a particular way of writing performs well!

     

-------------------------------------------------------------------------------------------------

My thoughts are a bit scattered today... because today is a special day. Not because of the college entrance examination, but because of Warcraft, a game I played for eight years, full of youthful passion. With memories fully loaded, let's get into it tonight!

  

Speaking of Warcraft got me excited, so let me add a note on my personal learning path: a few recommended books have been uploaded to a network disk.

Download Link: HTTP://PAN.BAIDU.COM/S/1KUDRPYF

----------------------------------------------------------------------------------------------------

Original link: http://www.cnblogs.com/double-K/archive/2016/06/02/5538249.html

For the convenience of reading a series of articles on the Guide link:

SQL Server fully optimized-------Expert for SQL Server Diagnostic series

SQL Server Full Optimization-------Writing good statements is a habit
