1.1.1 Summary
Join is one of the important operations of the relational database system. Common Join Operations in SQL Server include internal Join, external Join, and cross Join. If we want to obtain data from two or more tables that matches rows in one table with rows in another table, we should consider using Join, because Join queries a specific joined table or function
This article will introduce the features and usage of various common joins in SQL through examples:
1.1.2 text
First, we define three tables in tempdb: College, Student, and Apply. The specific SQL code is as follows:
Copy codeThe Code is as follows:
USE tempdb
---- If database exists the same name datatable deletes it.
If exists (SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'text') drop table College;
If exists (SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'student ') drop table Student;
If exists (SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'application') drop table Apply;
---- Create Database.
Create table College (cName nvarchar (50), state text, enrollment int );
Create table Student (sID int, sName nvarchar (50), GPA real, sizeHS int );
Create table Apply (sID int, cName nvarchar (50), major nvarchar (50), demo-text );
Inner join
Inner join is one of the most common join types. It queries data that meets the join predicates.
Suppose we want to query the information related to the applied School in the application form. We cannot predict that the Apply table contains the school name, so we can use the cName to join the school name (Inner join) the College and Apply tables are used to find the school information contained in the Apply table.
The specific SQL code is as follows:
Copy codeThe Code is as follows:
---- Gets college information from college table
---- Bases on college name.
Select distinct College. cName, College. enrollment
FROM College INNER JOIN
Apply ON College. cName = Apply. cName
Figure 1 query results
CName |
State |
Enrollment |
Stanford |
CA |
15000 |
Berkeley |
CA |
36000 |
MIT |
MA |
10000 |
Cornell |
NY |
21000 |
Harvard |
MA |
29000 |
Table 1 data in the College table
As shown in 1, we have queried the school information contained in the Apply table. As Harvard is not found, we know that no student has applied for Harvard.
Inner join satisfies the exchange law: "A inner join B" and "B inner join A" are equal.
Outer join
Suppose we want to see all the school information. Even for schools that have not applied for it (such as Harvard), we can use the Outer join to query the information. Because the external join stores all rows in one or two input tables, even the rows matching the join predicates cannot be found.
The specific SQL code is as follows:
Copy codeThe Code is as follows:
---- Gets all college information
SELECT College. cName, College. state, College. enrollment,
Apply. cName, Apply. major, Apply. demo-
FROM College LEFT OUTER JOIN
Figure 3 left join query results
As shown in figure 3: No student applies for Harvard in the Apply table, but we use left outer join to query all school information.
Because left outer join produces the full set of the College table, the matched values in the Apply table have values, and those that do not match are replaced by NULL values, so we know that no student applies for Harvard in the Apply table.
We can use the left join query to obtain the full set of the College. If we want to obtain both the full set of the College and the full set of Apply, we can consider using full outer join ). Using a complete external join, we can query all schools, regardless of whether they match the join predicates:
Copy codeThe Code is as follows:
---- Gets all information from college and apply table.
SELECT College. cName, College. state, College. enrollment,
Apply. cName, Apply. major, Apply. demo-
FROM College FULL OUTER JOIN
Apply ON College. cName = Apply. cName
Figure 3 complete external join query results
Now we have obtained the complete dataset of College and Apply, and there is a value for matching in the table, even if no matching cName is found, it will be replaced by NULL.
The following table shows the data rows retained when each external join matches:
Join type |
Retain data rows |
A left outer join B |
All A rows |
A right outer join B |
All B rows |
A full outer join B |
All A and B rows |
Table 2 external join reserved data rows
Full outer join satisfies the exchange law: "A full outer join B" and "B full outer join A" are equal.
Cross join
Cross join performs the Cartesian product of two tables (that is, to combine the data of table A and table B with N * M ). That is to say, it matches each row in one table and the other. We cannot specify a predicate by using the ON clause in the cross join. Although we can use the WHERE clause to achieve the same result, this is essentially an internal join.
Cross-join is inferior to internal join usage, and two large tables should not be joined, because this will lead to a very expensive operation and a very large result set.
The specific SQL code is as follows:
Copy codeThe Code is as follows:
---- College Cross join Apply.
SELECT College. cName, College. state, College. enrollment,
Apply. cName, Apply. major, Apply. demo-
FROM College
Cross join Apply
Figure 4 number of rows in the College and Apply tables
Figure 5 cross join
Now we perform a cross join on the College and Apply tables and generate a Cartesian product of the number of rows in the Data behavior College and Apply tables, that is, 5*20 = 100.
Cross apply
Cross apply is provided in SQL Server 2005 so that the table can be joined with the results of the table-valued functions TVF's function. For example, if we want to query the result value of the function and the table Student, we can use Cross apply to query:
Copy codeThe Code is as follows:
---- Creates a function to get data from Apply base on sID.
Create function dbo. fn_Apply (@ sID int)
RETURNS @ Apply TABLE (cName nvarchar (50), major nvarchar (50 ))
AS
BEGIN
INSERT @ Apply SELECT cName, major FROM Apply where [sID] = @ sID
RETURN
END
---- Student cross apply function fn_Apply.
SELECT Student. sName, Student. GPA, Student. sizeHS,
CName, major
FROM Student cross apply dbo. fn_Apply ([sID])
We can also use the internal connection to implement the same query function as Cross apply. The specific SQL code is as follows:
Copy codeThe Code is as follows:
---- Student inner join Apply bases on sID.
SELECT Student. sName, Student. GPA, Student. sizeHS,
CName, major
FROM Student inner join [Apply]
ON Student. sID = [Apply]. sID
Figure 6 Cross apply Query
Outer apply
After introducing Cross apply and Outer join, it is not difficult to understand Out apply. Outer apply enables the table to be used with table-valued functions (TVF's) the result is a join query. If a matching value is found, there is a value. If no matching value is found, it is expressed as NULL.
Copy codeThe Code is as follows:
---- Student outer apply function fn_Apply.
SELECT Student. sName, Student. GPA, Student. sizeHS,
CName, major
FROM Student outer apply dbo. fn_Apply ([sID])
Figure 7 Outer apply Query
Differences between Inner Join and Cross apply
First, we know that Inner join is a join query between tables, while Cross apply is a join query between tables and Table value functions. In the previous Cross apply example, we can also use Inner join to implement the same query.
Copy codeThe Code is as follows:
---- Student cross apply function fn_Apply.
SET STATISTICS PROFILE ON
SET STATISTICS TIME ON
SELECT Student. sName, Student. GPA, Student. sizeHS,
CName, major
FROM Student cross apply dbo. fn_Apply ([sID])
SET STATISTICS PROFILE OFF
SET STATISTICS TIME OFF
---- Student inner join Apply base on sID.
SET STATISTICS PROFILE ON
SET STATISTICS TIME ON
SELECT Student. sName, Student. GPA, Student. sizeHS,
CName, major
FROM Student inner join [Apply]
ON Student. sID = [Apply]. sID
SET STATISTICS PROFILE OFF
Set statistics time OFFCross apply
Query execution time:
CPU time = 0 ms, occupied time = 11 ms.
Inner join query execution time:
CPU time = 0 ms, occupied time = 4 ms.
Figure 8 execution plan
As shown in Figure 8: Cross apply first executes TVF (table-valued functions), then scans the full table of Studnet, and then searches for matching values by traversing the sID.
Inner join performs a full table scan on the Student and Apply tables, and searches for matched sID values through hash matching.
Through the preceding SQL Execution time and execution plan, can we say that Inner join is better than Cross apply? The answer is no. If the table has a large amount of data, the full table scan of Inner join will increase in time and CPU resources (you can test it through a table with a large amount of data ).
Although most queries implemented using Cross apply can be implemented through Inner join, Cross apply may produce better execution plans and better performance because it can restrict the join of a set before the join is executed.
Semi-join and Anti-semi-join
Semi-join performs an incomplete join query between the rows returned from one table and the data rows in the other table (if the matching data rows are found, the query is returned and no longer performed ).
Anti-semi-join performs an incomplete join query between the rows returned from one table and the data rows in another table, and then returns unmatched data.
Unlike other join operations, Semi-join and Anti-semi-join are not implemented using explicit syntax, however, Semi-join and Anti-semi-join have many application scenarios in SQL Server. We can use the EXISTS subquery to implement Semi-join queries, Not EXISTS to implement Anti-semi-join. Now let's explain it through specific examples!
Suppose we need to find the Student information that matches the sID in the Apply and Student tables. This is the same as the preceding Inner join query results. The specific SQL code is as follows:
Copy codeThe Code is as follows:
---- Student Semi-join Apply base on sID.
SELECT Student. sName, Student. GPA, Student. sizeHS
---- [Apply]. cName, [Apply]. major
FROM Student
WHERE exists (
SELECT *
From [Apply]
Where [Apply]. sID = Student. sID
)
We found that the commonly used EXISTS clause was originally implemented through Left Semi Join. Therefore, Semi-join is widely used in SQL Server.
Figure 9 query results
Figure 10 execution plan
Now we are asked to find out the student information that has NOT yet applied for the school. At this time, we immediately reflected that we can use the not exists clause to implement this query. The specific SQL code is as follows:
Copy codeThe Code is as follows:
---- Gets student still not apply for school.
SELECT Student. sID, Student. sName, Student. GPA, Student. sizeHS
---- [Apply]. cName, [Apply]. major
FROM Student
Where not exists (
SELECT *
FROM [Apply]
WHERE [Apply]. sID = Student. sID
)
In fact, the Common Implementation of the not exists clause is through Anti-semi-join. Through the execution plan, we find that when the query matches the sID, the SQL statement uses Left Anti Semi Join for query.
Figure 11 query results
Figure 12 execution plan
1.1.3 Summary
This article describes the application scenarios and features of join queries in SQL: Inner join, Outer join, Cross join, and Cross apply.