Explanation of SQL keyword execution order

Source: Internet
Author: User
Tags sql server query

explanation of SQL keyword execution order Category: Written interview summary2013-03-17 14:491622 People readComments (1)Collection ReportSQL keyword Order
the name of the field in the table does not add any symbols, the value of the words plus single quotation marks

There was a written examination of the knowledge points about the order in which the SQL keyword was executed.

We do software development, most people are inseparable from dealing with the database, especially the development of ERP, and database dealing with more frequent, storage process is thousands of rows, if the volume of data large, people flow, then I can also guarantee the next period of time the system can still run smoothly? Can I still make sure that the next person can read my stored procedures? Then I combined with the company's usual training and personal work experience at ordinary times and everyone to share, I hope to help you.
To know the SQL statements, I think we need to know how SQL Server Query Analyzer executes our SQL statements, and many of us will look at execution plans, or use Profiler to monitor and tune query statements or slow stored procedures, but if we know the execution logic order of the query parser , the hands of the time to be confident, then the start is not a certain point?
First, the logical execution order of the query
(1) from left_table
(3) Join_type join Right_table (2) on join_condition
(4) WHERE where_condition
(5) GROUP by group_by_list
(6) with {cube | rollup}
(7) Having having_condition
(8) SELECT (9) DISTINCT (one) top_specification select_list
(9) ORDER by order_by_list
The parsing order of standard SQL is:
(1) The FROM clause assembles data from different data sources
(2) WHERE clause filters records based on specified criteria
(3) GROUP BY clause divides data into multiple groupings
(4) Calculate with aggregate function
(5) using the HAVING clause to filter the grouping
(6) Calculating all expressions
(7) Use order by to sort the result set
Ii. Order of execution
1. From: Performs a Cartesian product generation of the first two tables in the FROM clause to generate a virtual table VT1
2. On: Apply the on filter to the VT1 table only the rows that satisfy the join_condition are inserted vt2
3. OUTER (Join): If the row rows that are not found in the OUTER join reservation table (preserved table) are added to the VT2 as outer rows, generate T3 if the from contains more than two tables, Repeat the steps and steps for the result table and the next table generated by the previous junction to end directly.
4. Where: Apply a where filter to vt3 only rows that make where_condition true are inserted VT4
5. GROUP BY: Groups the rows in the VT4 by the list of columns in the GROUP BY clause VT5
6. cube| ROLLUP: Inserting a Hyper-group (supergroups) into VT6, generating vt6
7. Having: Apply a have filter to vt6 only groups that make having_condition true are inserted vt7
8. Select: Process select list Generation Vt8
9. DISTINCT: Remove duplicate rows from vt8 to produce VT9
ORDER BY: Vt9 rows are generated by ordering the column list in the ORDER BY clause to generate a cursor VC10
TOP: Select a specified number or proportion of rows from the beginning of VC10 to generate vt11 and return the caller
See here, then use LINQ to SQL syntax a bit similar AH? If we understand the SQL Server execution order, then we will further develop the good habits of daily SQL, that is, in the implementation of the function of the idea of thinking about performance, the database is able to perform set operation of the tool, we should try to use this tool, so-called set operation is actually a batch operation, is to minimize the amount of large data on the client loop operation, and the use of SQL statements or stored procedures instead.
Iii. return only the data you need
The return data to the client needs at least the database extracts data, the network transmits the data, the client receives the data as well as the client processing data and so on, if returns the unnecessary data, will increase the server, the network and the client invalid labor, its harm is obvious, avoids this kind of event to need to notice:
A, Horizontal view
(1) Instead of writing the SELECT * statement, select the field you want.
(2) When you concatenate multiple tables in an SQL statement, use the alias of the table and prefix the alias to each column. In this way, you can reduce the parsing time and reduce the syntax errors caused by the column ambiguity.
1
If there are tables table1 (id,col1) and table2 (id,col2)
1
Select a.id, A.col1, b.col2
2
--Select a.ID, col1, col2– do not write this, not conducive to future program expansion
3
From table1 A inner joins table2 B on A.id=b.id Where ...
B, longitudinal view
(1) write the WHERE clause reasonably, and do not write the SQL statement without where.
(2) SELECT TOP N *-Use this substitution without a where condition.
Iv. doing repetitive work as little as possible
A, control the execution of the same statement multiple times, especially the number of basic data execution is rarely noticed by many programmers.
B, reduce the number of data conversion, may require data conversion is a design problem, but the reduction is the programmer can do.
C, eliminate unnecessary sub-queries and join tables, sub-query in the execution plan is generally interpreted as an external connection, redundant connection table brings additional overhead.
D. Combine multiple update of the same condition on the same table, such as
UPDATE EMPLOYEE SET fname= ' Haiwer '
WHERE emp_id= ' vpa30890f ' UPDATE EMPLOYEE SET lname= ' YANG '
WHERE emp_id= ' vpa30890f '
These two statements should be merged into one of the following statements
UPDATE EMPLOYEE SET fname= ' haiwer ', lname= ' YANG ' WHERE emp_id= ' vpa30890f '
E, update operation do not split into the form of delete operation +insert operation, although the function is the same, but the performance difference is very large.
V. To pay attention to the use of temporary tables and table variables
In complex systems, temporary tables and table variables are difficult to avoid, and for the use of temporal tables and table variables, it is important to note:
A, if the statement is complex and there are too many connections, consider stepping through the temporary table and table variables.
B, if you need to use the same part of a large table multiple times, consider staging this part of the data with temporary tables and table variables.
C, if you need to synthesize data from multiple tables to form a result, consider using temporal tables and table variables to summarize data for these tables in steps.
D, in other cases, you should control the use of temporary tables and table variables.
E, on the choice of temporary tables and table variables, many of the statements are table variables in memory, fast, should prefer table variables, but in actual use found:
(1) The main consideration is the amount of data that needs to be placed on the temporary table, and the temporary table will be faster than the amount of data.
(2) Execution time period and expected execution time (how long)
F, about temporary table generation using SELECT INTO and CREATE TABLE + INSERT into selection, generally:
The SELECT into will be much faster than the Create TABLE + INSERT into method.
However, select into locks the system tables sysobjects, sysindexes, and syscolumns in tempdb, which can easily block other processes in a multiuser concurrency environment.
So my advice is, in a concurrent system, try to use create TABLE + INSERT into, while a single statement of large data is used, using SELECT INTO.
Vi. usage of subqueries
A subquery is a select query that is nested within a SELECT, INSERT, UPDATE, DELETE statement, or other subquery.
Subqueries can be used wherever expressions are allowed, and subqueries make our programming flexible and can be used to implement special functions. But in performance, often an inappropriate subquery usage can create a performance bottleneck. If a subquery's condition uses a field of its outer table, this seed query is called a correlated subquery.
Correlated subqueries can be introduced with in, not in, EXISTS, not EXISTS. For related subqueries, it should be noted that:
(1) The related subquery for not in, not exists can use the left join instead of the notation. Like what:
Select Pub_name from Publishers where pub_id Notin (SELECT pub_id from the TITLES WHERE TYPE = ' business ')
Can be changed to write:
SELECT A.pub_name from Publishers A leftjoin TITLES B on b.type = "Business" and a.pub_id=b. pub_id WHERE b.pub_id is NULL
For example not EXISTS:
SELECT TITLE from TITLES
WHERE not EXISTS
(SELECT title_id from SALES WHERE title_id = TITLES. title_id)
1
Can be changed to write:
SELECT TITLE
From TITLES Leftjoin SALES
On SALES. title_id = TITLES. title_id
WHERE SALES. title_id ISNULL
2) If the subquery is guaranteed to be not duplicated, the in and exists related subqueries can be replaced with the inner JOIN. Like what:
SELECT pub_name
From publishers
WHERE pub_id in
(SELECT pub_id
From TITLES
WHERE TYPE = ' business ')
1
Can be changed to write:
SELECT a.pub_name--select DISTINCT a.pub_name
From publishers A Innerjoin TITLES B
On b.type = "Business" and
A.pub_id=b. pub_id
(3) In the correlation sub-query is replaced by exists, for example:
SELECT Pub_name From Publishers
WHERE pub_id in
(SELECT pub_id from the TITLES WHERE TYPE = ' business ')
You can use the following statement instead:
SELECT pub_name from Publishers WHERE EXISTS
(Select1from TITLES WHERE TYPE = ' business ' and
Pub_id= Publishers. pub_id)
4) Do not use the subquery of Count (*) to determine if there is a record, preferably with a left join or exists, for example, someone writes such a statement:
SELECT Job_desc from JOBS
WHERE (SelectCount (*) from EMPLOYEE WHERE job_id=jobs. job_id) =0
Should be changed to:
SELECT JOBS. Job_desc from JOBS leftjoin EMPLOYEE
On EMPLOYEE. Job_id=jobs. job_id
WHERE EMPLOYEE. emp_id ISNULL
  
SELECT Job_desc from JOBS
Where (SELECT COUNT (*) from EMPLOYEE WHERE job_id=jobs. job_id) 0
Should be changed to:
SELECT Job_desc from JOBS
Whereexists (SELECT 1 from EMPLOYEE WHERE job_id=jobs. JOB_ID)
Seven: Try to use the index
After indexing, not every query uses an index, and the efficiency of indexing can vary greatly when using indexes. As long as we do not enforce the specified index in the query statement, the selection and use of the index is an automatic choice of SQL Server optimizer, and it is selected based on the conditions of the query statement and the statistics of the related table, which requires us to write the SQL statement as far as possible to make the optimizer use the index. In order for the optimizer to use the index efficiently, it should be noted when writing statements:
(1) Do not operate on indexed fields, but try to make transformations, such as:
SELECT ID from T WHERE num/2=100
should read:
SELECT ID from T WHERE num=100*2

SELECT ID from T WHERE num/2=num1
If Num has an index, it should read:
SELECT ID from T WHERE num=num1*2
If the NUM1 has an index, it should not be changed.
(2) A statement was found:
SELECT year, month, amount from balance table where100* year + month =2010*100+10
1
should read:
SELECT year, month, amount from balance table WHERE year =2010 and month =10
(3) Do not format conversion of indexed fields
Examples of date fields:
Whereconvert (VARCHAR (10), date field, 120) = ' 2010-07-15 '
should read
Where date field 〉= ' 2010-07-15 ' and date field ' 2010-07-16 '

Examples of IsNull conversions:
Whereisnull (field, ') ' should be changed to: where Field '
Whereisnull (field, ') = ' should not be modified
Whereisnull (field, ' F ') = ' t ' should be changed to: where field = ' t '
Whereisnull (field, ' F ') ' t ' should not be modified
(4) Do not format conversion of indexed fields
Whereleft (NAME, 3) = ' abc ' or where SUBSTRING (name,1, 3) = ' abc '
Should read: WHERE NAME like ' abc% '
Examples of date queries:
Wheredatediff (day, date, ' 2010-06-30 ') =0
Should read: WHERE date = ' 2010-06-30 ' and date ' 2010-07-01 '
Wheredatediff (day, date, ' 2010-06-30 ') 0
Should read: WHERE date ' 2010-06-30 '
Wheredatediff (day, date, ' 2010-06-30 ') =0
Should read: WHERE date ' 2010-07-01 '
Wheredatediff (day, date, ' 2010-06-30 ') 0
Should read: WHERE date = ' 2010-07-01 '
Wheredatediff (day, date, ' 2010-06-30 ') =0
Should read: WHERE date = ' 2010-06-30 '
(5) Do not use functions on indexed fields
where left (NAME, 3) = ' abc ' or where SUBSTRING (name,1, 3) = ' abc '
Should read: WHERE NAME like ' abc% '
Examples of date queries:
Wheredatediff (day, date, ' 2010-06-30 ') =0
Should read: WHERE date = ' 2010-06-30 ' and date ' 2010-07-01 '
Wheredatediff (day, date, ' 2010-06-30 ') 0
Should read: WHERE date ' 2010-06-30 '
Wheredatediff (day, date, ' 2010-06-30 ') =0
Should read: WHERE date ' 2010-07-01 '
Wheredatediff (day, date, ' 2010-06-30 ') 0
Should read: WHERE date = ' 2010-07-01 '
Wheredatediff (day, date, ' 2010-06-30 ') =0
Should read: WHERE date = ' 2010-06-30 '
(6) Do not make multi-field connections to indexed fields
Like what:
WHERE fame+ '. ' +lname= ' Haiwei. YANG '

should read:
WHERE fname= ' Haiwei ' and lname= ' YANG '
Eight: The connection condition of the multi-table connection is of great significance to the selection of the index, so we need to pay special attention when we write the condition of the connection.
A, multi-table connection, the connection conditions must be written in full, rather repeat, do not leave gaps.
B, connection conditions use clustered index as far as possible
C, notice the difference between on, where and having partial conditions
On is the first execution, where the second, having the last, because on is the non-qualified records filtered before the statistics, it can reduce the intermediate operation to process the data, it should be said that the speed is the fastest, where also should be faster than having, Because it filters the data before it does sum, it is used on two table joins, so when a table is left, there is a comparison of where with having.
1
Consider the join precedence order:
2
(1) INNER JOIN
3
(2) LEFT join (note: Right join replaces with left join)
4
(3) Cross JOIN
Other areas of attention and understanding are:
A, in the face value of the list, will appear the most frequent values on the front, the least appears on the last side, reduce the number of judgments.
B, note the difference between Union and union all. --Allow duplicate data with union ALL good
C, pay attention to the use of distinct, do not use when not necessary.
D, TRUNCATE TABLE is different from DELETE.
E, reduce the number of access to the database.
There is we write stored procedures, if long, and finally with the mark descriptor open, because it is very readable, even if the statement is not good, but the statement is neat, C # Region,sql I prefer to use is:
--startof Enquiry for Number of employees
SQL statements
--end of
On the formal machine we generally can not easily debug the program, but many times the program in our machine is no problem, but into the formal system there is a problem, but we can not casually in the formal machine operation, then how to do? We can debug our stored procedures or SQL statements with rollback, thus troubleshooting.
Begintran
UPDATE a SET field = ' '
ROLLBACK
Job stored procedures I usually add the following paragraph, so that the check error can be placed in the stored procedure, if the error rollback operation, but if there is already a transaction rollback in the program, then the stored procedure does not write the transaction, which will cause the transaction rollback nesting to reduce the efficiency of execution, But most of the time we can put the check in the stored procedure, which helps us to interpret the stored procedure and debug the error.
BeginTransaction
--Transaction rollback begins

--Check the error
IF (@ @ERROR0)
BEGIN
--rollback operation
RollbackTransaction
RAISERROR (' Delete work report error ', 16, 3)
RETURN
END

--End transaction

CommitTransaction


Source: >  

From for notes (Wiz)

Explanation of SQL keyword execution order

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.