Various stages in SQL Server query processing (SQL execution order)

Source: Internet
Author: User
Tags aliases sql server query

The most obvious feature of SQL differs from other programming languages is the order in which the code is processed. In a large number programming language, code is processed in encoded order, but in the SQL language, the first processed clause is the FROM clause, although the SELECT statement first appears, but is almost always finally processed.

Each step produces a virtual table that is used as input to the next step. These virtual tables are not available to callers (client applications or external queries). Only the table generated in the last step is returned to the caller. If you do not specify a clause in the query, the corresponding step is skipped. The following is a brief description of each logical step that is applied to SQL Server 2000 and SQL Server 2005.


(8) SELECT (9) DISTINCT (one) <top num> <select list>
(1) from [left_table]
(3) <join_type> join <right_table>
(2) on <join_condition>
(4) WHERE <where_condition>
(5) GROUP by <group_by_list>
(6) With <cube | Rollup>
(7) Having (10) ORDER by <order_by_list>

Introduction to the Logical query processing phase

    1. from: performs a cartesian product (Cartesian product) (cross join) on the first two tables in the FROM clause, generating a virtual table VT1
    2. on: The applies an on filter to the VT1. Only those lines that make <join_condition> true are inserted into the VT2.
    3. OUTER (join): If you specify a OUTER join (relative to a cross join or (INNER join), keep the table (preserved Table: Left outer join marks the left table as a reserved table, right outer join marks the right table as a reserved table, a full outer join marks two tables as a reserved table) and no matching rows are found in the VT2 as an outer row. Generate VT3. If the FROM clause contains more than two tables, repeat steps 1 through 3 for the result table and the next table that you generated for the previous join until you have finished processing all the tables.
    4. Where: applies a where filter to VT3. Only rows that make <where_condition> true are inserted VT4.
    5. GROUP By: generates VT5 by grouping rows in VT4 by the list of columns in the GROUP BY clause.
    6. cube| ROLLUP: Inserts a Hyper-group (suppergroups) into VT5, generating VT6.
    7. has: has a having filter applied to VT6. Only groups that make
    8. Select: processes the select list, producing VT8.
    9. DISTINCT: removes duplicate rows from VT8, resulting in VT9.
    10. ORDER BY: generates a cursor (VC10) by sorting the rows in VT9 by the list of columns in the ORDER by clause.
    11. TOP: selects the specified number or scale of rows from the beginning of the VC10, generates the table VT11, and returns the caller.

Note: Step 10, sort the rows returned by the column list in the ORDER BY clause, and return the cursor VC10. This step is the first and only step you can use the column aliases in the select list. This step differs from the other step in that it does not return a valid table, but instead returns a cursor. SQL is based on the set theory. The collection does not pre-order its rows, it is only a logical collection of members, and the order of the members is irrelevant. A query that sorts a table can return an object that contains rows organized in a specific physical order. ANSI calls this object a cursor. Understanding this step is the basis for a proper understanding of SQL.

Because this step does not return a table (instead of returning a cursor), a query that uses the ORDER BY clause cannot be used as a table expression. Table expressions include: views, inline table-valued functions, subqueries, derived tables, and common expressions. Its result must be returned to the client application that expects to get the physical record. For example, the following derived table query is invalid and produces an error:



As D

The following view also generates an error

CREATE VIEW My_view
As
SELECT *
From Orders
ORDER BY OrderID

In SQL, a query with an ORDER BY clause is not allowed in a table expression, but there is an exception in T-SQL (apply top option).

So remember, don't assume any particular order for the rows in the table. In other words, do not specify an ORDER BY clause unless you are sure you want to order rows. Sorting is a cost, and SQL Server needs to perform an ordered index scan or use the sort runner.
Recommend a SQL code: row and column transpose


/* Question: Suppose there is a student score table (TB) as follows:
Name Course Score
Zhang San language 74
Zhang San Mathematics 83
Zhang San Physics 93
John Doe Language 74
John Doe Mathematics 84
John Doe Physics 94

Want to become (get the following result):
Name Chinese mathematics Physics
---- ---- ---- ----
Lee 474 84 94
Sheet 374 83 93
-------------------
*/

CREATE table TB (name varchar (10), course varchar (10), fractional int)
INSERT into TB values (' Zhang San ', ' language ', 74)
INSERT into TB values (' Zhang San ', ' math ', 83)
INSERT into TB values (' Zhang San ', ' physical ', 93)
INSERT into TB values (' John Doe ', ' language ', 74)
INSERT into TB values (' John Doe ', ' math ', 84)
INSERT into TB values (' John Doe ', ' physical ', 94)
Go

--sql SERVER 2000 Static SQL, refers to the course only language, mathematics, physics, this course. (hereinafter)
Select name as name,
Max (case course when ' language ' then score else 0 end) language,
Max (case course when ' math ' then fraction else 0 end) Math,
Max (case course when ' physical ' then fraction else 0 end) physical
From TB
Group BY name==================================================================================================== seems to be writing SQL statement when the execution order of the various keywords is not clear, often the organization of SQL statements lack good logic, with the feeling of "patchwork" (sorry, if your SQL statements are often "pieced together", then you have to reflect on it?).
This is really cool yourself, can be bitter machine, the server also needs in our disorganized SQL statement to find its next sentence to execute where the keyword.
Efficiency, because our sensory nerves are not sensitive to changes in the second, it is considered that the SQL order of self-written is irrelevant, "nothing changes anyway!" In fact, the server for each SQL parsing time will be detailed records, you can see the custom written by the SQL and the standard sequence of the SQ L How big the time difference is.
Therefore, we recommend that you work in peacetime SQL statements in accordance with the standard sequence of writing, one is professional, the second is practical, hehe, but I think the main thing is to feel comfortable in the heart.
The parsing order of standard SQL is:
(1). FROM clause, assemble data from different data sources
(2). WHERE clause to filter records based on specified criteria
(3). GROUP BY clause, dividing data into multiple groupings
(4). Using aggregate functions for calculations
(5). Use the HAVING clause to filter the grouping
(6). Calculate All expressions
(7). Use order by to sort the result set
Example: In the Student score table (Tb_grade), the "Examinee name" content is not empty records according to "examinee name" group, and filter the results of the group, select "Total Score" greater than 600 points.
The standard sequential SQL statements are:
Select Candidate Name, Max (total) as Max Total
From Tb_grade
Where examinee name is not NULL
Group BY candidate name
Having Max (total) > 600
ORDER BY Max Total
In the example above, the SQL statements are executed in the following order:
(1). First execute the FROM clause to assemble data from the Tb_grade table from the data source
(2). Execute a WHERE clause to filter data that is not NULL for all data in the Tb_grade table
(3). Execute the GROUP BY clause to group the Tb_grade table by the Student Name column
(4). Calculate the max () aggregate function and the maximum number of the total scores by "total"
(5). The HAVING clause is executed, and the total score of the screening course is greater than 600.
(7). Execute the ORDER BY clause to sort the final results by "Max score". =========================================================================== ========================== order of execution of query statements in SQLServer2005

--1.from
--2.on
--3.outer (Join)
--4.where
--5.group by
--6.cube|rollup
--7.having
--8.select
--9.distinct
--10.order by
--11.top


1. Logical query Processing step number
(8) SELECT (9) DISTINCT (one) <TOP_specification> <select_list>
(1) From <left_table>
(3) <join_type> join <right_table>
(2) on <join_condition>
(4) WHERE <where_condition>
(5) GROUP by <group_by_list>
(6) with {CUBE | ROLLUP}
(7) Having (10) ORDER by <order_by_list>

Each step produces a virtual table that is used as input to the next step.
Only the table that is generated in the last step is returned to the caller.
If there is no clause, skip the appropriate step.
1. From:
Performs a Cartesian product on the first two tables in the FROM clause, generating the virtual table VT1.
2. On:
Apply an on filter to VT1. Only those lines that make <join_condition> true are inserted into the VT2.
3. OUTER (JOIN):
If outer JOIN is specified, a matching row in the reserved table is added to VT2 as an outer row, generating VT3.
If the FROM clause contains more than two tables, repeat steps 1 through 3 for the result table and the next table that you generated for the previous join until you have finished processing all the tables.
4. Apply the where filter to the VT3. Only rows that make <where_condition> true are inserted into the VT4.
5. GROUP by:
Groups the rows in VT4 by the list of columns in the GROUP BY clause, generating VT5.
6. cube| ROLLUP:
Insert a super-group into VT5 to generate VT6.
7. Having:
Apply a have filter to VT6.
Only groups that make

Note: Having cannot be used alone, having clauses is a filter for the records after grouping, so there must be a group by
8. SELECT:
Processes the select list, producing VT8.
9. DISTINCT:
The duplicate rows are removed from the VT8, resulting in VT9.
Ten. ORDER by:
The rows in VT9 are sorted by the list of columns in the ORDER BY clause, and a table (VC10) is generated.
TOP: Selects the specified number or scale of rows from the beginning of the VC10, generates the table VT11, and returns it to the caller.

Note: Top N can achieve paging

Select top * FROM employee------first page

Select Top * FROM Employees

Where ID no. in (SELECT Top 20 ID number from Employee)------second page

2. Preparing the data

SET NOCOUNT on;
Use tempdb;
GO
 
IF object_id (' dbo. Orders ') is not NULL
DROP TABLE dbo. Orders;
GO
IF object_id (' dbo. Customers ') is not NULL
DROP TABLE dbo. Customers;
GO
 
CREATE TABLE dbo. Customers
(
CustomerID CHAR (5) Not NULL PRIMARY KEY,
City VARCHAR (Ten) not NULL
);
 
INSERT into dbo. Customers (customerid,city) VALUES (' Fissa ', ' Madrid ');
INSERT into dbo. Customers (customerid,city) VALUES (' Frndo ', ' Madrid ');
INSERT into dbo. Customers (customerid,city) VALUES (' Krlos ', ' Madrid ');
INSERT into dbo. Customers (customerid,city) VALUES (' mrphs ', ' Zion ');
 
CREATE TABLE dbo. Orders
(
OrderID INT not NULL PRIMARY KEY,
CustomerID CHAR (5) NULL REFERENCES Customers (CustomerID)
);
 
INSERT into dbo. Orders (OrderID, CustomerID) VALUES (1, ' Frndo ');
INSERT into dbo. Orders (OrderID, CustomerID) VALUES (2, ' Frndo ');
INSERT into dbo. Orders (OrderID, CustomerID) VALUES (3, ' Krlos ');
INSERT into dbo. Orders (OrderID, CustomerID) VALUES (4, ' Krlos ');
INSERT into dbo. Orders (OrderID, CustomerID) VALUES (5, ' Krlos ');
INSERT into dbo. Orders (OrderID, CustomerID) VALUES (6, ' mrphs ');
INSERT into dbo. Orders (OrderID, CustomerID) VALUES (7, NULL);

Execution Result:

3. Query statements

USE tempdb;
GO
SELECT C.customerid, COUNT(O.orderid) AS numorders
FROM dbo.Customers AS C
 LEFT OUTER JOIN dbo.Orders AS O
    ON C.customerid = O.customerid
WHERE C.city = ‘Madrid‘
GROUP BY C.customerid
HAVING COUNT(O.orderid) < 3
ORDER BY numorders;

Execution Result:

4. Detailed logic Query processing steps

1. Perform Cartesian product to form VT1. If the left table contains n rows and the right table contains m rows, VT1 will contain the NXM rows.

Execution result VT1:

2. Apply the on filter, only those rows that are <join_condition> True will be included in the VT2.

ON C.customerid = O.customerid

Three-valued Logic:

TRUE, FALSE, and unknown are possible values for logical expressions in SQL.

Unknown values typically appear in logical expressions with null values, such as null > 42; NULL = NULL; X + NULL > Y.

Not TRUE equals FALSE

Not FALSE equals True

Not UNKNOWN equals UNKNOWN

All query filters, such as on, where, have the unknown look as false processing.

The unknown value in the check constraint is treated as true. If the table contains a check constraint that requires that the value of the salary column must be greater than 0, inserting a row that is salary null can be accepted.

Unique constraints, sort operations, and grouping operations consider two null values to be equal. For example, there is a column in the table that defines a unique constraint, and you cannot insert two rows with NULL for that column value into the table. The GROUP BY clause groups all null values. The Orderb by clause arranges all the null values together.

Add the result of the on filter to VT1 VT2:

  

3. Add an outer row, and you can mark the left table, right table, and all tables as reserved tables by specifying a outer JOIN in leave, OK, and full. Setting a table to a reserved table indicates that all rows of the table are returned, even if <join_condition> has already performed a filter. These rows in the reserved table are called outer rows, and the properties of the non-reserved table in the outer row are given null, and the VT3 is generated last:

  

4. Apply the Where filter, only rows that meet <where_condition> will become part of the VT4. Because the data is not yet grouped, you cannot use an aggregation filter, such as WHERE OrderDate = MAX (OrderDate). You cannot also drink aliases in the select list because the select list is not processed at this time, such as select Year (OrderDate) as OrderYear WHERE orderyear > 2000.

For a query that contains a OUTER join clause, how do you determine whether the logical expression is specified on the on filter or in the Where filter: On is applied before the outer row is added, where it is applied after the outer row is added. The on filter is not the final one in the central branch of the reserved table, because the steps to add the outer row are also performed, and the where filter is final for the removal of the rows.

The ON and WHERE clauses only have this logical limit when using an outer join, and when using an inner join, it doesn't matter where you specify the logical expression, because there is no step 3 above.

WHERE C.city = ‘Madrid‘

To generate a virtual table VT4:

  

5. Grouping. Each unique value of the column list in the GROUP BY clause is combined into a group that generates VT5:

Groups

Raw

C.customerid

Fissa

Frndo

Krlos

VT5 consists of two parts: the Group section and the raw sections.

If a GROUP BY clause is specified in the query, then all subsequent steps (such as having, SELECT) can only specify an expression that is a scalar value that may be obtained as a group. That is, the result of an expression is a column/expression (such as: C.customer) or aggregate function (for example: COUNT (O.orderid)) in the Group by list. This limitation is because the final result set contains only one row for each group.

At this stage, two null are considered equal. All null values are assigned to a group.

If group by all is specified, the group that is removed in the where filter is added to VT5, and the original part is an empty collection. In a later step, the result of applying the count aggregate function to the group is 0, and the result of applying the other aggregate functions is null. It is best not to use GROUP by all.

  

6. Using the cube or Rollup option, a hyper-group is created and added to the virtual table returned in the previous step, generating the VT6.

  

7. Apply the having filter, only groups that meet

HAVING COUNT(O.orderid) < 3

Count (O.orderid) is used here instead of Count (*), so the outer row is not counted in count because O.orderid is null. The count (O.orderid) for this group, such as Fissa, is 0.

The red part is a group that has been filtered out.

Groups

Raw

C.customerid

Fissa

Frndo

Krlos

  

8. Process the select list and apply an alias to an expression that is not a base column so that it has a name in the result table. Aliases created in the select list cannot be used in the previous steps, or even in the select list, and can only be used in order by.

SELECT C.customerid, COUNT(O.orderid) AS numorders

Generate VT8:

Logically, all operations should be assumed to occur simultaneously.

  

9. Apply the DISTINCT clause, if the DISTINCT clause is specified in the query, remove the duplicate rows from the virtual table returned in the previous step and generate the virtual table VT9. Using GROUP BY, the use of distinct is superfluous.

  

10. Apply the ORDER BY clause to return the cursor VC10 by sorting the rows returned in the previous step in the list of columns in the ORDER BY clause. Only this step can use the Select alias. If you specify an expression in the Distinct,order by clause to access only the virtual table returned in the previous step, you can only sort by the column that you have selected.

The support for order by is enhanced in ANSI SQL 1999, allowing access to input virtual tables and output virtual tables in the Select phase. That is, if distinct is not specified, any expression that can be used in the SELECT clause can be specified in the ORDER BY clause, and can be sorted by an expression that does not exist in the final result set.

ORDER BY numorders;

You can also specify the ordinal of the result column in the select list in the ORDER BY clause:

ORDER BY 2, 1;

But try not to do so, because it may change the select list but forget to modify the order by list, and when the select list is long, it is not a good way to check the number.

Because this step returns a cursor instead of returning a table, a query that uses the ORDER BY clause cannot be used as a table expression. Table expressions include: views, inline table-valued functions, subqueries, derived tables, and common table expressions (CTE).

Do not assume the order of the rows in the table, unless you do need an ordered row, do not specify an ORDER BY clause. Sorting is a cost, and SQL Server needs to perform an ordered index scan or use sort operators.

ORDER by this step considers that two null are equal, all null is aligned, and ANSI does not specify whether NULL is higher or lower than the known value, but instead leaves the problem to a specific implementation, where the null rank in T-SQL is lower than the known value.

ORDER BY numorders

The returned cursor VC10:

 

11. Apply the top option, select the specified number of rows from the front of the cursor, generate the table VT11 and return it to the caller. In SQL Server 2000, the input of top must be a constant, and in 2005 it can be any independent expression.

Without an ORDER BY clause or with TIES option, the rows returned are exactly the first rows that are physically accessed, and may produce different results.

A query with an ORDER BY clause can be used in a table expression only if the top option is specified:

SELECT *
FROM (SELECT TOP 100 PERCENT orderid, customerid
        FROM dbo.Orders
        ORDER BY orderid) AS D;

Various stages in SQL Server query processing (SQL execution order)

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.