Z Advanced T-SQL query--in-depth understanding of subqueries

Last Update:2014-08-22 Source: Internet

Author: User

Tags scalar

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

SQL has a very powerful and flexible way of querying, and multi-table join operations can often be replaced with subqueries, this article will cover all aspects of subqueries.

Introduction

A subquery is essentially a restricted SELECT statement that is nested into another select,update,insert,delete statement, in which only the following clauses can be used

Select clause (required)
FROM clause (required)
WHERE clause (optional)
GROUP by (optional)
Have (optional)
Order BY (available only when the top keyword is used)

Subqueries can also be nested in other subqueries, which can be up to 32 layers in a nested set. A subquery is also called an internal query (Inner query) or an internal selection (Inner Select), and a query statement that contains subqueries is also called an external query (outter) or an external selection (Outer Select), the concept of subqueries can be simply explained:

is a subquery that is used as a data source.

In general, a subquery can be divided into three types according to the type of data returned by the subquery, respectively:

Returns a data table (table)
Returns a list of values (column)
Returns a single value (Scalar)

Below, we follow these three ways to illustrate sub-query

subqueries are used as data sources

When a subquery is used after the FROM clause of an external query, the subquery is used as a data source , even if the subquery returns only a single value (Scalar) or a column value (column), which can still be considered a special A data source , a two-dimensional data table (table). A subquery that is used as a data source is much like a view , except that the subquery only exists temporarily and is not included in the database.

For example, this statement:

SELECT     P.productid, P.name, P.productnumber, m.name as Productmodelnamefrom         production.product as P INNER JOIN ( SELECT     Name, productmodelid from          Production.ProductModel) as M on p.productmodelid = M.productmodelid

The subquery statement above ProductModel a subset of M in the table as a data source (table) and a product table for internal connections. The results are as follows:

Using as a data source is also the simplest application for subqueries. Of course, when a subquery is used as a data source, it is also divided into correlated subqueries and unrelated subqueries , which are described later in the article.

subqueries are used as selection criteria

subqueries, which are the criteria for selection, are also the most complex of sub-query applications.

Subqueries that are selection criteria are those that return only a single column (column) , which, if used as a selection criterion, can be thought of as a column of only one row , even if it returns only one value . Like what:

In the AdventureWorks:

I would like to get a total of 68 hours of sick days for employees :

SELECT [FirstName]      , [MiddleName]      , [LastName] from  [adventureworks].[ Person]. [Contact]  WHERE ContactID in   (SELECT EmployeeID from  [adventureworks].[ HumanResources]. [Employee]  WHERE sickleavehours>68)

The results are as follows:

In the above query, the subquery that follows the IN keyword returns a column of values to use as the selection criteria for the external query .

Similarly, the not-in keyword, which is logically reversed from the In keyword, is no longer described here

However, it is important to emphasize that you should not use the in and not keyword, which can cause a lot of potential problems, and this article has a good explanation for this question: http://wiki.lessthandot.com/index.php/Subquery_typo_with_using_in. The point of this article is to never use the in and not keyword again, my point is that it is reasonable to be present, and I think it is only possible to use in and not in when inside is a fixed value, for example:

SELECT [FirstName]      , [MiddleName]      , [LastName] from  [adventureworks].[ Person]. [Contact]  WHERE ContactID in  (25,33)

Only in this case, using the in and not keywords is safe, in other cases it is best to use the Exists,not exists,join keyword instead. In addition to in, the keywords used to select criteria have any and all, and the two keywords are the same as their literal meanings. and "<", ">", "=" connection used, such as the above with in the subquery:

I'd like to get an employee who has a total of sick days greater than 68 hours

The equivalent query with the ANY keyword is:

SELECT [FirstName]      , [MiddleName]      , [LastName] from  [adventureworks].[ Person]. [Contact]  WHERE ContactID =any  (SELECT EmployeeID from  [adventureworks].[ HumanResources]. [Employee]  WHERE sickleavehours>68)

When used as the any and all keywords in a subquery, the effect is as follows

=any	and in equivalence
<>all	And not in equivalence
>any	Greater than the smallest (>min)
<any	Less than the maximum (<max)
>all	Greater than the maximum (>max)
<all	Less than the smallest (<min)
=all	Below say

The =all keyword is seldom used, and the effect in a subquery is equal to "=" If there is only one return value, and if there are multiple return values, the result is empty

It is important to note that SQL is a very flexible language, just as the effects of a subquery can be implemented using join (with the same effect, with different ideas), the effects of any and all can be replaced in other ways, as shown in the table above, >any and > Min is fully equivalent, such as the following two query statements are completely equivalent:

Select *from AdventureWorks.HumanResources.EmployeeWHERE sickleavehours>any (select Sickleavehours from AdventureWorks.HumanResources.Employee WHERE sickleavehours>68)
Select *from AdventureWorks.HumanResources.EmployeeWHERE sickleavehours> (select MIN (sickleavehours) from AdventureWorks.HumanResources.Employee WHERE sickleavehours>68)

correlated subqueries and EXISTS keywords

The query mentioned earlier is irrelevant subquery (uncorrelated subquery), there is a kind of important query in the subquery is the related subquery (correlated subquery), also known as a repeating subquery , for example, Or the above query, with the related subquery to write:

I would like to get a total of 68 days of sick days for employees :

SELECT [FirstName]      , [MiddleName]      , [LastName] from  [adventureworks].[ Person]. [Contact] C  WHERE EXISTS  (SELECT * from  [adventureworks].[ HumanResources]. [Employee] e  WHERE C.contactid=e.contactid and e.sickleavehours>68)

The results are the same as the query results that use the IN keyword:

How do you differentiate between correlated subqueries and unrelated subqueries ? The simplest way is to see whether the subquery itself can be executed, such as the subquery in the example above:

(SELECT *  from [AdventureWorks]. [HumanResources]. [Employee] e  WHERE C.contactid=e.contactid and e.sickleavehours>68)

The execution of this sentence itself will be an error. Because this sentence refers to a table of external queries

For unrelated subqueries, the entire query is executed only once for the subquery, and then to an external query, such as:

SELECT *from AdventureWorks.HumanResources.EmployeeWHERE Sickleavehours>anysqlresult

The above unrelated subquery, the entire query process can be considered as a subquery first return sqlresult (SQL result set), and then to external query use, the entire process subquery only once

Conversely, as a correlated subquery , the number of times the subquery executes depends on the external query, and the outer query executes one row at a time, such as:

Or the above example: I would like to get an employee who has a total of sick days greater than 68 days

SELECT [FirstName]      , [MiddleName]      , [LastName] from  [adventureworks].[ Person]. [Contact] C  WHERE EXISTS  (SELECT * from  [adventureworks].[ HumanResources]. [Employee] e  WHERE C.contactid=e.contactid and e.sickleavehours>68)----step 1:select [FirstName]      , [ MiddleName]      , [LastName] from  [adventureworks].[ Person]. [Contact] C  WHERE EXISTS  (SELECT * from  [adventureworks].[ HumanResources]. [Employee] e  WHERE 1=e.contactid and e.sickleavehours>68)----step 2:select [FirstName]      , [MiddleName]      , [LastName]  from [AdventureWorks]. [Person]. [Contact] C  WHERE EXISTS  (SELECT * from  [adventureworks].[ HumanResources]. [Employee] e  WHERE 2=e.contactid and e.sickleavehours>68)----step n:select [FirstName]      , [MiddleName]      , [LastName]  from [AdventureWorks]. [Person]. [Contact] C  WHERE EXISTS  (SELECT * from  [adventureworks].[ HumanResources]. [Employee] e  WHERE N=e.contactid and e.sickleavehours>68)

As shown in the code above. The related subquery above actually executes n times (n depends on the number of rows of the outer query), each row of the outer query passes the parameters used by the corresponding row to the subquery, and returns TRUE if the subquery has a corresponding value (both the current row is selected and displayed in the result), or false if not. Then repeat the next line.

subqueries are used as computed columns

When a subquery is used as a computed column , only a single value (Scalar) is returned. Used as a computed column after the SELECT statement. Also divided into correlated sub-queries and unrelated subqueries

Examples of correlated subqueries such as: I want to get the name of each product and the total sales

Select [Name],      (select COUNT (*) from AdventureWorks.Sales.SalesOrderDetail S      WHERE s.productid=p.productid) As Salesamountfrom [AdventureWorks]. [Production]. [Product] P

Some of the results are as follows:

When a subquery is used as a computed column , a unique value is returned for each row of the outer query .

Similarly, the SQL subquery can use other statements to achieve the same effect, the above statement and the following statement to achieve the same effect:

SELECT P.name,count (S.productid) from [AdventureWorks]. [Production]. [Product] P left joins  AdventureWorks.Sales.SalesOrderDetail SON S.productid=p.productidgroup by P.name

When a subquery is used as a computed column and is not an unrelated subquery, it is returned only once but a value, which is no longer elaborated here.

Summary

This article elaborates subqueries through three different uses of subqueries. At the same time, all sub-queries can be divided into correlated subqueries and unrelated subqueries, and the functions implemented by subqueries can be implemented using joins or other methods. But a good writer should be a rich vocabulary, not just to express what he means. Learning a variety of SQL queries is the only way to learn SQL queries.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More