SQL Server optimized SQL statement optimization

Source: Internet
Author: User

Everything is for performance, everything is for business

First, the logical execution order of the query

(1) from left_table (3) Join_type join Right_table (2) in Join_condition (4) WHERE where_condition (5) GROUP by group_by_list (6) with {cube | rollup} (7) has having_condition (8) SELECT (9) DISTINCT (one) top_specification select_list (9) ORDER by Order_by_list

The parsing order of standard SQL is:

(1) The FROM clause assembles data from different data sources

(2) WHERE clause filters records based on specified criteria

(3) GROUP BY clause divides data into multiple groupings

(4) Calculate with aggregate function

(5) using the HAving clause to filter the grouping

(6) Calculating all expressions

(7) Use order by to sort the result set

Ii. Order of execution

1. From: Performs a Cartesian product generation of the first two tables in the FROM clause to generate a virtual table VT1

2. On: Apply the on filter to the VT1 table only the rows that satisfy the join_condition are inserted vt2

3. OUTER (Join): If the row rows that are not found in the OUTER join reservation table (preserved table) are added to the VT2 as outer rows, generate T3 if the from contains more than two tables, Repeat the steps and steps for the result table and the next table generated by the previous junction to end directly.

4. Where: Apply a where filter to vt3 only rows that make where_condition true are inserted VT4

5. GROUP BY: Groups the rows in the VT4 by the list of columns in the GROUP BY clause VT5

6. cube| ROLLUP: Inserting a Hyper-group (supergroups) into VT6, generating vt6

7. Having: Apply a have filter to vt6 only groups that make having_condition true are inserted vt7

8. Select: Process select list Generation Vt8

9. DISTINCT: Remove duplicate rows from vt8 to produce VT9

ORDER BY: Vt9 rows are generated by ordering the column list in the ORDER BY clause to generate a cursor VC10

TOP: Select a specified number or proportion of rows from the beginning of VC10 to generate vt11 and return the caller

See here, then use LINQ to SQL syntax a bit similar AH? If we understand the SQL Server execution order, then we will further develop the good habits of daily SQL, that is, in the implementation of the function of the idea of thinking about performance, the database is able to perform set operation of the tool, we should try to use this tool, so-called set operation is actually a batch operation, is to minimize the amount of large data on the client loop operation, and the use of SQL statements or stored procedures instead.

Iii. return only the data you need

The return data to the client needs at least the database extracts data, the network transmits the data, the client receives the data as well as the client processing data and so on, if returns the unnecessary data, will increase the server, the network and the client invalid labor, its harm is obvious, avoids this kind of event to need to notice:

A, Horizontal view

(1) Instead of writing the SELECT * statement, select the field you want.

(2) When you concatenate multiple tables in an SQL statement, use the alias of the table and prefix the alias to each column. In this way, you can reduce the parsing time and reduce the syntax errors caused by the column ambiguity.

If there are tables table1 (id,col1) and table2 (id,col2) Select a.ID, A.col1, b.col2--select a.ID, col1, col2– do not write this, not conducive to future program extension from table1 A Inn Er join table2 B on A.id=b.id Where ...

B, longitudinal view

(1) write the WHERE clause reasonably, and do not write the SQL statement without where.

(2) SELECT TOP N *– does not have a where condition with this substitution.

Iv. doing repetitive work as little as possible

A, control the execution of the same statement multiple times, especially the number of basic data execution is rarely noticed by many programmers.

B, reduce the number of data conversion, may require data conversion is a design problem, but the reduction is the programmer can do.

C, eliminate unnecessary sub-queries and join tables, sub-query in the execution plan is generally interpreted as an external connection, redundant connection table brings additional overhead.

D. Combine multiple update of the same condition on the same table, such as

Update employee set fname= ' Haiwer ' where emp_id= ' vpa30890f ' UPDATE employee set lname= ' YANG ' where emp_id= ' vpa30890f '

These two statements should be merged into one of the following statements

UPDATE EMPLOYEE SET fname= ' haiwer ', lname= ' YANG ' WHERE emp_id= ' vpa30890f '

E, update operation do not split into the form of delete operation +insert operation, although the function is the same, but the performance difference is very large.

V. To pay attention to the use of temporary tables and table variables

In complex systems, temporary tables and table variables are difficult to avoid, and for the use of temporal tables and table variables, it is important to note:

A, if the statement is complex and there are too many connections, consider stepping through the temporary table and table variables.

B, if you need to use the same part of a large table multiple times, consider staging this part of the data with temporary tables and table variables.

C, if you need to synthesize data from multiple tables to form a result, consider using temporal tables and table variables to summarize data for these tables in steps.

D, in other cases, you should control the use of temporary tables and table variables.

E, on the choice of temporary tables and table variables, many of the statements are table variables in memory, fast, should prefer table variables, but in actual use found:

(1) The main consideration is the amount of data that needs to be placed on the temporary table, and the temporary table will be faster than the amount of data.

(2) Execution time period and expected execution time (how long)

F, about temporary table generation using SELECT INTO and CREATE TABLE + INSERT into selection, generally:

The SELECT into will be much faster than the Create TABLE + INSERT into method.

However, select into locks the system tables sysobjects, sysindexes, and syscolumns in tempdb, which can easily block other processes in a multiuser concurrency environment.

So my advice is, in a concurrent system, try to use create TABLE + INSERT into, while a single statement of large data is used, using SELECT INTO.

Vi. usage of subqueries

A subquery is a select query that is nested within a SELECT, INSERT, UPDATE, DELETE statement, or other subquery.

Subqueries can be used wherever expressions are allowed, and subqueries make our programming flexible and can be used to implement special functions. But in performance, often an inappropriate subquery usage can create a performance bottleneck. If a subquery's condition uses a field of its outer table, this seed query is called a correlated subquery.

Correlated subqueries can be introduced with in, not in, EXISTS, not EXISTS. For related subqueries, it should be noted that:

(1) The related subquery for not in, not exists can use the left join instead of the notation. Like what:

Select Pub_name from Publishers where pub_id Notin (SELECT pub_id from the TITLES WHERE TYPE = ' business ')

Can be changed to write:

SELECT A.pub_name from Publishers A leftjoin TITLES B on b.type = "Business" and a.pub_id=b. pub_id WHERE b.pub_id is NULL

For example not EXISTS:

Select TITLE from Titleswhere not EXISTS (select title_id from SALES WHERE title_id = TITLES. title_id)
Can be changed to write:
SELECT titlefrom TITLES leftjoin saleson SALES. title_id = TITLES. Title_idwhere SALES. title_id ISNULL

2) If the subquery is guaranteed to be not duplicated, the in and exists related subqueries can be replaced with the inner JOIN. Like what:

Select Pub_namefrom publisherswhere pub_id in (select Pub_idfrom titleswhere TYPE = ' business ')

Can be changed to write:

SELECT a.pub_name--select DISTINCT a.pub_namefrom Publishers A innerjoin TITLES BON b.type = ' Business ' ANDA. Pub_id=b. pub_id

(3) In the correlation sub-query is replaced by exists, for example:

Select Pub_name from Publisherswhere pub_id in (SELECT pub_id from TITLES WHERE TYPE = ' business ')

You can use the following statement instead:

SELECT pub_name from Publishers where EXISTS (Select1from TITLES where TYPE = ' business ' andpub_id= publishers. pub_id)

4) Do not use the subquery of Count (*) to determine if there is a record, preferably with a left join or exists, for example, someone writes such a statement:

SELECT Job_desc from Jobswhere (SelectCount (*) from EMPLOYEE WHERE job_id=jobs. job_id) =0

Should be changed to:

SELECT JOBS. Job_desc from JOBS leftjoin employeeon EMPLOYEE. Job_id=jobs. Job_idwhere EMPLOYEE. emp_id Isnullselect Job_desc from Jobswhere (SELECT COUNT (*) from EMPLOYEE WHERE job_id=jobs. JOB_ID)

Should be changed to:

Select Job_desc from Jobswhereexists (select 1 from EMPLOYEE WHERE job_id=jobs. JOB_ID)
Seven: Try to use the index

After indexing, not every query uses an index, and the efficiency of indexing can vary greatly when using indexes. As long as we do not enforce the specified index in the query statement, the selection and use of the index is an automatic choice of SQL Server optimizer, and it is selected based on the conditions of the query statement and the statistics of the related table, which requires us to write the SQL statement as far as possible to make the optimizer use the index. In order for the optimizer to use the index efficiently, it should be noted when writing statements:

(1) Do not operate on indexed fields, but try to make transformations, such as:

SELECT ID from T WHERE num/2=100

should read:

SELECT ID from T WHERE num=100*2
SELECT ID from T WHERE num/2=num1

If Num has an index, it should read:

SELECT ID from T WHERE num=num1*2

If the NUM1 has an index, it should not be changed.

(2) A statement was found:

SELECT year, month, amount from balance table where100* year + month =2010*100+10

should read:

SELECT year, month, amount from balance table WHERE year =2010 and month =10

(3) Do not format conversion of indexed fields

Examples of date fields:

WHERE CONVERT (VARCHAR (10), date field, 120) = ' 2010-07-15 '

should read

WHERE Date field 〉= ' 2010-07-15 ' and date field ' 2010-07-16 '

Examples of IsNull conversions:

where ISNULL (field, ")" should be changed to: where Field "

WHERE ISNULL (field, ") =" should not be modified

where ISNULL (field, ' F ') = ' t ' should be changed to: where field = ' t '

WHERE ISNULL (field, ' F ') ' t ' should not be modified

(4) Do not format conversion of indexed fields

where left (NAME, 3) = ' abc ' or where SUBSTRING (name,1, 3) = ' abc '

Should read: WHERE NAME like ' abc% '

Examples of date queries:

WHERE DATEDIFF (day, date, ' 2010-06-30 ') =0

should read:

WHERE date = ' 2010-06-30 ' and date ' 2010-07-01 '
WHERE DATEDIFF (day, date, ' 2010-06-30 ') =0

should read:

WHERE Date ' 2010-06-30 '
WHERE DATEDIFF (day, date, ' 2010-06-30 ') =0

should read:

WHERE Date ' 2010-07-01 '
WHERE DATEDIFF (day, date, ' 2010-06-30 ') =0

should read:

WHERE date = ' 2010-07-01 '
WHERE DATEDIFF (day, date, ' 2010-06-30 ') =0

should read:

WHERE date = ' 2010-06-30 '

(5) Do not use functions on indexed fields

where left (NAME, 3) = ' abc ' or where SUBSTRING (name,1, 3) = ' abc '

should read:

WHERE NAME like ' abc% '

Examples of date queries:

WHERE DATEDIFF (day, date, ' 2010-06-30 ') =0

should read:

WHERE date = ' 2010-06-30 ' and date ' 2010-07-01 '
WHERE DATEDIFF (day, date, ' 2010-06-30 ') =0

should read:

WHERE Date ' 2010-06-30 '
WHERE DATEDIFF (day, date, ' 2010-06-30 ') =0

should read:

WHERE Date ' 2010-07-01 '
WHERE DATEDIFF (day, date, ' 2010-06-30 ') =0

should read:

WHERE date = ' 2010-07-01 '
WHERE DATEDIFF (day, date, ' 2010-06-30 ') =0

should read:

WHERE date = ' 2010-06-30 '

(6) Do not make multi-field connections to indexed fields

Like what:

WHERE fame+ '. ' +lname= ' Haiwei. YANG '

should read:

WHERE fname= ' Haiwei ' and lname= ' YANG '
Eight: The connection condition of the multi-table connection is of great significance to the selection of the index, so we need to pay special attention when we write the condition of the connection.

A, multi-table connection, the connection conditions must be written in full, rather repeat, do not leave gaps.

B, connection conditions use clustered index as far as possible

C, notice the difference between on, where and having partial conditions

On is the first execution, where the second, having the last, because on is the non-qualified records filtered before the statistics, it can reduce the intermediate operation to process the data, it should be said that the speed is the fastest, where also should be faster than having, Because it filters the data before it does sum, it is used on two table joins, so when a table is left, there is a comparison of where with having.

(1) INNER JOIN

(2) LEFT join (note: Right join replaces with left join)

(3) Cross JOIN

Other areas of attention and understanding are:

A, in the face value of the list, will appear the most frequent values on the front, the least appears on the last side, reduce the number of judgments.

B, note the difference between Union and union all. – Allow duplicate data with union ALL good

C, pay attention to the use of distinct, do not use when not necessary.

D, TRUNCATE TABLE is different from DELETE.

E, reduce the number of access to the database.

There is we write stored procedures, if long, and finally with the mark descriptor open, because it is very readable, even if the statement is not good, but the statement is neat, C # Region,sql I prefer to use is:

–startof Enquiry for Number of employees

SQL statements

–end of

On the formal machine we generally can not easily debug the program, but many times the program in our machine is no problem, but into the formal system there is a problem, but we can not casually in the formal machine operation, then how to do? We can debug our stored procedures or SQL statements with rollback, thus troubleshooting.

Begintran

UPDATE a SET field = "

ROLLBACK

Job stored procedures I usually add the following paragraph, so that the check error can be placed in the stored procedure, if the error rollback operation, but if there is already a transaction rollback in the program, then the stored procedure does not write the transaction, which will cause the transaction rollback nesting to reduce the efficiency of execution, But most of the time we can put the check in the stored procedure, which helps us to interpret the stored procedure and debug the error.

BeginTransaction

– Transaction rollback begins

– Check for errors

IF (@ @ERROR0) begin--rollback operation Rollbacktransactionraiserror (' Delete work report error ', 3) returnend

– End Transaction

CommitTransaction

SQL Server optimized SQL statement optimization

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.