SQL Join Summary (example)

Source: Internet
Author: User

1.1.1 Summary
Join is one of the important operations of the relational database system. Common Join Operations in SQL Server include internal Join, external Join, and cross Join. If we want to obtain data from two or more tables that matches rows in one table with rows in another table, we should consider using Join, because Join queries a specific joined table or function

This article will introduce the features and usage of various common joins in SQL through examples:

1.1.2 text
First, we define three tables in tempdb: College, Student, and Apply. The specific SQL code is as follows:
Copy codeThe Code is as follows:
USE tempdb

---- If database exists the same name datatable deletes it.
If exists (SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'text') drop table College;
If exists (SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'student ') drop table Student;
If exists (SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'application') drop table Apply;

---- Create Database.
Create table College (cName nvarchar (50), state text, enrollment int );
Create table Student (sID int, sName nvarchar (50), GPA real, sizeHS int );
Create table Apply (sID int, cName nvarchar (50), major nvarchar (50), demo-text );


Inner join

Inner join is one of the most common join types. It queries data that meets the join predicates.

Suppose we want to query the information related to the applied School in the application form. We cannot predict that the Apply table contains the school name, so we can use the cName to join the school name (Inner join) the College and Apply tables are used to find the school information contained in the Apply table.

The specific SQL code is as follows:
Copy codeThe Code is as follows:
---- Gets college information from college table
---- Bases on college name.
Select distinct College. cName, College. enrollment
FROM College INNER JOIN
Apply ON College. cName = Apply. cName

Figure 1 query results
CName State Enrollment
Stanford CA 15000
Berkeley CA 36000
MIT MA 10000
Cornell NY 21000
Harvard MA 29000
Table 1 data in the College table

As shown in 1, we have queried the school information contained in the Apply table. As Harvard is not found, we know that no student has applied for Harvard.

Inner join satisfies the exchange law: "A inner join B" and "B inner join A" are equal.

Outer join
Suppose we want to see all the school information. Even for schools that have not applied for it (such as Harvard), we can use the Outer join to query the information. Because the external join stores all rows in one or two input tables, even the rows matching the join predicates cannot be found.

The specific SQL code is as follows:
Copy codeThe Code is as follows:
---- Gets all college information
SELECT College. cName, College. state, College. enrollment,
Apply. cName, Apply. major, Apply. demo-
FROM College LEFT OUTER JOIN

Figure 3 left join query results

As shown in figure 3: No student applies for Harvard in the Apply table, but we use left outer join to query all school information.

Because left outer join produces the full set of the College table, the matched values in the Apply table have values, and those that do not match are replaced by NULL values, so we know that no student applies for Harvard in the Apply table.

We can use the left join query to obtain the full set of the College. If we want to obtain both the full set of the College and the full set of Apply, we can consider using full outer join ). Using a complete external join, we can query all schools, regardless of whether they match the join predicates:
Copy codeThe Code is as follows:
---- Gets all information from college and apply table.
SELECT College. cName, College. state, College. enrollment,
Apply. cName, Apply. major, Apply. demo-
FROM College FULL OUTER JOIN
Apply ON College. cName = Apply. cName

Figure 3 complete external join query results

Now we have obtained the complete dataset of College and Apply, and there is a value for matching in the table, even if no matching cName is found, it will be replaced by NULL.

The following table shows the data rows retained when each external join matches:

Join type

Retain data rows

A left outer join B

All A rows

A right outer join B

All B rows

A full outer join B

All A and B rows


Table 2 external join reserved data rows

Full outer join satisfies the exchange law: "A full outer join B" and "B full outer join A" are equal.

Cross join
Cross join performs the Cartesian product of two tables (that is, to combine the data of table A and table B with N * M ). That is to say, it matches each row in one table and the other. We cannot specify a predicate by using the ON clause in the cross join. Although we can use the WHERE clause to achieve the same result, this is essentially an internal join.

Cross-join is inferior to internal join usage, and two large tables should not be joined, because this will lead to a very expensive operation and a very large result set.

The specific SQL code is as follows:
Copy codeThe Code is as follows:
---- College Cross join Apply.
SELECT College. cName, College. state, College. enrollment,
Apply. cName, Apply. major, Apply. demo-
FROM College
Cross join Apply

Figure 4 number of rows in the College and Apply tables 

Figure 5 cross join

Now we perform a cross join on the College and Apply tables and generate a Cartesian product of the number of rows in the Data behavior College and Apply tables, that is, 5*20 = 100.

Cross apply
Cross apply is provided in SQL Server 2005 so that the table can be joined with the results of the table-valued functions TVF's function. For example, if we want to query the result value of the function and the table Student, we can use Cross apply to query:
Copy codeThe Code is as follows:
---- Creates a function to get data from Apply base on sID.
Create function dbo. fn_Apply (@ sID int)
RETURNS @ Apply TABLE (cName nvarchar (50), major nvarchar (50 ))
AS
BEGIN
INSERT @ Apply SELECT cName, major FROM Apply where [sID] = @ sID
RETURN
END

---- Student cross apply function fn_Apply.
SELECT Student. sName, Student. GPA, Student. sizeHS,
CName, major
FROM Student cross apply dbo. fn_Apply ([sID])

We can also use the internal connection to implement the same query function as Cross apply. The specific SQL code is as follows:
Copy codeThe Code is as follows:
---- Student inner join Apply bases on sID.
SELECT Student. sName, Student. GPA, Student. sizeHS,
CName, major
FROM Student inner join [Apply]
ON Student. sID = [Apply]. sID


Figure 6 Cross apply Query

Outer apply
After introducing Cross apply and Outer join, it is not difficult to understand Out apply. Outer apply enables the table to be used with table-valued functions (TVF's) the result is a join query. If a matching value is found, there is a value. If no matching value is found, it is expressed as NULL.
Copy codeThe Code is as follows:
---- Student outer apply function fn_Apply.
SELECT Student. sName, Student. GPA, Student. sizeHS,
CName, major
FROM Student outer apply dbo. fn_Apply ([sID])

Figure 7 Outer apply Query 


Differences between Inner Join and Cross apply

First, we know that Inner join is a join query between tables, while Cross apply is a join query between tables and Table value functions. In the previous Cross apply example, we can also use Inner join to implement the same query.
Copy codeThe Code is as follows:
---- Student cross apply function fn_Apply.
SET STATISTICS PROFILE ON
SET STATISTICS TIME ON

SELECT Student. sName, Student. GPA, Student. sizeHS,
CName, major
FROM Student cross apply dbo. fn_Apply ([sID])

SET STATISTICS PROFILE OFF
SET STATISTICS TIME OFF

---- Student inner join Apply base on sID.
SET STATISTICS PROFILE ON
SET STATISTICS TIME ON

SELECT Student. sName, Student. GPA, Student. sizeHS,
CName, major
FROM Student inner join [Apply]
ON Student. sID = [Apply]. sID

SET STATISTICS PROFILE OFF
Set statistics time OFFCross apply

Query execution time:

CPU time = 0 ms, occupied time = 11 ms.

Inner join query execution time:

CPU time = 0 ms, occupied time = 4 ms.

Figure 8 execution plan

As shown in Figure 8: Cross apply first executes TVF (table-valued functions), then scans the full table of Studnet, and then searches for matching values by traversing the sID.

Inner join performs a full table scan on the Student and Apply tables, and searches for matched sID values through hash matching.

Through the preceding SQL Execution time and execution plan, can we say that Inner join is better than Cross apply? The answer is no. If the table has a large amount of data, the full table scan of Inner join will increase in time and CPU resources (you can test it through a table with a large amount of data ).

Although most queries implemented using Cross apply can be implemented through Inner join, Cross apply may produce better execution plans and better performance because it can restrict the join of a set before the join is executed.

Semi-join and Anti-semi-join

Semi-join performs an incomplete join query between the rows returned from one table and the data rows in the other table (if the matching data rows are found, the query is returned and no longer performed ).

Anti-semi-join performs an incomplete join query between the rows returned from one table and the data rows in another table, and then returns unmatched data.

Unlike other join operations, Semi-join and Anti-semi-join are not implemented using explicit syntax, however, Semi-join and Anti-semi-join have many application scenarios in SQL Server. We can use the EXISTS subquery to implement Semi-join queries, Not EXISTS to implement Anti-semi-join. Now let's explain it through specific examples!

Suppose we need to find the Student information that matches the sID in the Apply and Student tables. This is the same as the preceding Inner join query results. The specific SQL code is as follows:
Copy codeThe Code is as follows:
---- Student Semi-join Apply base on sID.
SELECT Student. sName, Student. GPA, Student. sizeHS
---- [Apply]. cName, [Apply]. major
FROM Student
WHERE exists (
SELECT *
From [Apply]
Where [Apply]. sID = Student. sID
)

We found that the commonly used EXISTS clause was originally implemented through Left Semi Join. Therefore, Semi-join is widely used in SQL Server.


Figure 9 query results



Figure 10 execution plan

Now we are asked to find out the student information that has NOT yet applied for the school. At this time, we immediately reflected that we can use the not exists clause to implement this query. The specific SQL code is as follows:
Copy codeThe Code is as follows:
---- Gets student still not apply for school.
SELECT Student. sID, Student. sName, Student. GPA, Student. sizeHS
---- [Apply]. cName, [Apply]. major
FROM Student
Where not exists (
SELECT *
FROM [Apply]
WHERE [Apply]. sID = Student. sID
)

In fact, the Common Implementation of the not exists clause is through Anti-semi-join. Through the execution plan, we find that when the query matches the sID, the SQL statement uses Left Anti Semi Join for query.

Figure 11 query results


Figure 12 execution plan

1.1.3 Summary
This article describes the application scenarios and features of join queries in SQL: Inner join, Outer join, Cross join, and Cross apply.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.