1.1.1 Summary
Join is one of the most important operations in a relational database system. Common join operations in SQL Server include inner join, outer join, and cross join. To obtain data from two or more tables by matching rows in one table with rows in another, we use a join; a join can combine a table with another table or with the result of a table-valued function.
This article will introduce the features and usage of various common joins in SQL through examples:
Contents
- Inner join
- Outer Join
- Cross join
- Cross apply
- Differences between cross apply and inner join
- Semi-join and anti-semi-join
1.1.2 Body
First, we define three tables in tempdb: College, Student, and Apply. The code is as follows:
USE tempdb;

-- Drop the tables if tables with the same names already exist.
IF EXISTS (SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'College')
    DROP TABLE College;
IF EXISTS (SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'Student')
    DROP TABLE Student;
IF EXISTS (SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'Apply')
    DROP TABLE [Apply];

-- Create the tables.
CREATE TABLE College (Cname NVARCHAR(50), State TEXT, Enrollment INT);
CREATE TABLE Student (Sid INT, Sname NVARCHAR(50), GPA REAL, Sizehs INT);
CREATE TABLE [Apply] (Sid INT, Cname NVARCHAR(50), Major NVARCHAR(50), Decision TEXT);
Inner join
Inner join is one of the most common join types. It returns only the rows that satisfy the join predicate.
Suppose we want to look up information about the schools that appear in the applications. The Apply table contains the school name, but we cannot know in advance which schools it lists, so we inner join the College and Apply tables on Cname to find the information for the schools that appear in Apply.
The specific SQL code is as follows:
-- Get college information from the College table
-- for the colleges that appear in Apply, matched on college name.
SELECT DISTINCT College.Cname, College.Enrollment
FROM College
INNER JOIN [Apply] ON College.Cname = [Apply].Cname
Figure 1 query results
Cname    | State | Enrollment
Stanford | CA    | 15000
Berkeley | CA    | 36000
MIT      | MA    | 10000
Cornell  | NY    | 21000
Harvard  | MA    | 29000
Table 1 Data in the College table
As Figure 1 shows, the query returns the information for the schools that appear in the Apply table. Harvard is missing from the result, so we know that no student has applied to Harvard.
Inner join is commutative: "A inner join B" and "B inner join A" return the same result.
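To try the inner join without creating the tempdb tables, here is a minimal sketch that uses table variables. The College rows are taken from Table 1; the three Apply rows are made-up sample data (the article does not list the contents of Apply), and the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- Table variables stand in for the College and Apply tables.
DECLARE @College TABLE (Cname NVARCHAR(50), State NVARCHAR(2), Enrollment INT);
DECLARE @Apply   TABLE (Sid INT, Cname NVARCHAR(50), Major NVARCHAR(50));

INSERT INTO @College (Cname, State, Enrollment) VALUES
    (N'Stanford', N'CA', 15000),
    (N'Berkeley', N'CA', 36000),
    (N'MIT',      N'MA', 10000),
    (N'Cornell',  N'NY', 21000),
    (N'Harvard',  N'MA', 29000);

-- Hypothetical applications; nobody applies to Harvard.
INSERT INTO @Apply (Sid, Cname, Major) VALUES
    (123, N'Stanford', N'CS'),
    (123, N'Berkeley', N'CS'),
    (234, N'Stanford', N'EE');

-- "College inner join Apply": only colleges with at least one application.
SELECT DISTINCT c.Cname, c.Enrollment
FROM @College AS c
INNER JOIN @Apply AS a ON c.Cname = a.Cname;

-- "Apply inner join College": the same rows, because inner join is commutative.
SELECT DISTINCT c.Cname, c.Enrollment
FROM @Apply AS a
INNER JOIN @College AS c ON c.Cname = a.Cname;
```

With this sample data both queries return Stanford and Berkeley; MIT, Cornell, and Harvard are absent because no row in @Apply matches them.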
Outer Join
Suppose we want to see the information for all schools, including schools that nobody has applied to (such as Harvard). We can use an outer join, because an outer join preserves all rows of one or both input tables even when no row matching the join predicate is found.
The specific SQL code is as follows:
-- Get information for all colleges.
SELECT College.Cname, College.State, College.Enrollment,
       [Apply].Cname, [Apply].Major, [Apply].Decision
FROM College
LEFT OUTER JOIN [Apply] ON College.Cname = [Apply].Cname
Figure 2 left join query results
As Figure 2 shows, no student in the Apply table applies to Harvard, but the left outer join still returns the information for every school.
A left outer join returns every row of the College table and fills the Apply columns with the matching values where they exist and with NULL where they do not. The NULLs in Harvard's row therefore tell us that no student has applied to Harvard.
A left join returns all rows of College. If we want all rows of both College and Apply, we can use a full outer join. A full outer join returns every school and every application, whether or not it satisfies the join predicate:
-- Get all information from the College and Apply tables.
SELECT College.Cname, College.State, College.Enrollment,
       [Apply].Cname, [Apply].Major, [Apply].Decision
FROM College
FULL OUTER JOIN [Apply] ON College.Cname = [Apply].Cname
Figure 3 full outer join query results
Now we have the complete data of both College and Apply. Where a match exists, the columns hold the matching values; where no matching Cname is found, they are filled with NULL.
The following table shows which rows each outer join retains:
Join type            | Retained rows
A left outer join B  | All rows of A
A right outer join B | All rows of B
A full outer join B  | All rows of A and B
Table 2 Rows retained by outer joins
Full outer join is commutative: "A full outer join B" and "B full outer join A" return the same result.
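The retention rules in Table 2 can be checked on a tiny data set. The sketch below is illustrative only: the table variables and their rows (including the Yale application, which has no matching college) are assumptions made for the example, not data from the article.

```sql
DECLARE @College TABLE (Cname NVARCHAR(50));
DECLARE @Apply   TABLE (Sid INT, Cname NVARCHAR(50), Major NVARCHAR(50));

INSERT INTO @College (Cname) VALUES (N'Stanford'), (N'Harvard');

-- Hypothetical rows: one application that matches a college and one
-- for a college that is missing from @College.
INSERT INTO @Apply (Sid, Cname, Major) VALUES
    (123, N'Stanford', N'CS'),
    (345, N'Yale',     N'History');

-- Left outer join: keeps Stanford and Harvard; Harvard's Sid and Major are NULL.
SELECT c.Cname AS College, a.Sid, a.Major
FROM @College AS c
LEFT OUTER JOIN @Apply AS a ON c.Cname = a.Cname;

-- Right outer join: keeps both applications; the Yale row has a NULL College.
SELECT c.Cname AS College, a.Sid, a.Major
FROM @College AS c
RIGHT OUTER JOIN @Apply AS a ON c.Cname = a.Cname;

-- Full outer join: keeps Stanford, Harvard, and the Yale application.
SELECT c.Cname AS College, a.Sid, a.Cname AS AppliedTo, a.Major
FROM @College AS c
FULL OUTER JOIN @Apply AS a ON c.Cname = a.Cname;
```

The left and right outer joins each return two rows, and the full outer join returns three.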
Cross join
A cross join produces the Cartesian product of two tables: every row of table A is combined with every row of table B, giving n * m rows for tables of n and m rows. A cross join does not take an ON clause, so no join predicate can be specified. We can use a WHERE clause to filter the result instead, but then the cross join behaves essentially like an inner join.
Cross join is used less often than inner join, and it should not be applied to two large tables, because the operation is very expensive and the result set is very large.
The specific SQL code is as follows:
-- College cross join Apply.
SELECT College.Cname, College.State, College.Enrollment,
       [Apply].Cname, [Apply].Major, [Apply].Decision
FROM College
CROSS JOIN [Apply]
Figure 4 number of rows in the College and apply tables
Figure 5 cross join
Now we cross join the College and Apply tables. The result is their Cartesian product, and its row count is the product of the row counts of College and Apply, that is, 5 * 20 = 100.
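The row-count rule and the relationship between a filtered cross join and an inner join can be seen on two small tables. This is a sketch with hypothetical table variables @A and @B that are not part of the article's schema.

```sql
DECLARE @A TABLE (X INT);
DECLARE @B TABLE (Y INT);

INSERT INTO @A (X) VALUES (1), (2), (3);
INSERT INTO @B (Y) VALUES (2), (3);

-- 3 rows * 2 rows = 6 combinations.
SELECT COUNT(*) AS CrossJoinRows
FROM @A AS a
CROSS JOIN @B AS b;

-- Filtering the Cartesian product in the WHERE clause ...
SELECT a.X, b.Y
FROM @A AS a
CROSS JOIN @B AS b
WHERE a.X = b.Y;

-- ... returns the same rows as an inner join with the predicate in ON.
SELECT a.X, b.Y
FROM @A AS a
INNER JOIN @B AS b ON a.X = b.Y;
```

The last two queries both return the pairs (2, 2) and (3, 3).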
Cross apply
Cross apply was introduced in SQL Server 2005 so that a table can be joined with the result of a table-valued function (TVF). For example, to combine each row of the Student table with the rows the function returns for it, we can use cross apply:
-- Create a function that returns the rows of Apply for a given Sid.
CREATE FUNCTION dbo.Fn_apply (@Sid INT)
RETURNS @Apply TABLE (Cname NVARCHAR(50), Major NVARCHAR(50))
AS
BEGIN
    INSERT @Apply
    SELECT Cname, Major FROM [Apply] WHERE [Sid] = @Sid
    RETURN
END
GO

-- Student cross apply function Fn_apply.
SELECT Student.Sname, Student.GPA, Student.Sizehs, Cname, Major
FROM Student
CROSS APPLY dbo.Fn_apply([Sid])
We can also use an inner join to implement the same query as cross apply. The specific SQL code is as follows:
-- Student inner join Apply based on Sid.
SELECT Student.Sname, Student.GPA, Student.Sizehs, Cname, Major
FROM Student
INNER JOIN [Apply] ON Student.Sid = [Apply].Sid
Figure 6 cross apply Query
Outer apply
After cross apply and outer join, outer apply is easy to understand. Outer apply also joins a table with the result of a table-valued function (TVF), but it keeps every row of the table: where the function returns matching rows, their values are shown, and where it returns none, the function's columns are NULL.
-- Student outer apply function Fn_apply.
SELECT Student.Sname, Student.GPA, Student.Sizehs, Cname, Major
FROM Student
OUTER APPLY dbo.Fn_apply([Sid])
Figure 7 outer apply Query
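The difference between cross apply and outer apply is easiest to see when one row has no match. The sketch below replaces the Fn_apply function with a correlated derived table, which the APPLY operator also accepts, so that it runs without creating any objects. The students Amy and Bob and their applications are hypothetical sample data.

```sql
DECLARE @Student TABLE (Sid INT, Sname NVARCHAR(50));
DECLARE @Apply   TABLE (Sid INT, Cname NVARCHAR(50), Major NVARCHAR(50));

-- Hypothetical data: Amy has two applications, Bob has none.
INSERT INTO @Student (Sid, Sname) VALUES (123, N'Amy'), (456, N'Bob');
INSERT INTO @Apply (Sid, Cname, Major) VALUES
    (123, N'Stanford', N'CS'),
    (123, N'Berkeley', N'CS');

-- Cross apply: Bob has no applications, so his row is dropped.
SELECT s.Sname, x.Cname, x.Major
FROM @Student AS s
CROSS APPLY (SELECT a.Cname, a.Major
             FROM @Apply AS a
             WHERE a.Sid = s.Sid) AS x;

-- Outer apply: Bob is kept, with NULL for Cname and Major.
SELECT s.Sname, x.Cname, x.Major
FROM @Student AS s
OUTER APPLY (SELECT a.Cname, a.Major
             FROM @Apply AS a
             WHERE a.Sid = s.Sid) AS x;
```

The cross apply returns two rows (both for Amy), and the outer apply returns three (Amy twice, plus Bob with NULLs).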
Differences between inner join and cross apply
First, inner join is a join between tables, while cross apply is a join between a table and a table-valued function. As the previous cross apply example showed, the same query can also be written with inner join. To compare the two, we run both with statistics enabled:
-- Student cross apply function Fn_apply.
SET STATISTICS PROFILE ON
SET STATISTICS TIME ON
SELECT Student.Sname, Student.GPA, Student.Sizehs, Cname, Major
FROM Student
CROSS APPLY dbo.Fn_apply([Sid])
SET STATISTICS PROFILE OFF
SET STATISTICS TIME OFF

-- Student inner join Apply based on Sid.
SET STATISTICS PROFILE ON
SET STATISTICS TIME ON
SELECT Student.Sname, Student.GPA, Student.Sizehs, Cname, Major
FROM Student
INNER JOIN [Apply] ON Student.Sid = [Apply].Sid
SET STATISTICS PROFILE OFF
SET STATISTICS TIME OFF
Cross apply query execution time:
CPU time = 0 ms, elapsed time = 11 ms.
Inner join query execution time:
CPU time = 0 ms, elapsed time = 4 ms.
Figure 8 execution plan
As Figure 8 shows, cross apply first executes the TVF (table-valued function), then performs a full table scan of Student, and then finds the matching values by iterating over Sid.
Inner join performs a full table scan on both the Student and Apply tables and finds the matching Sid values through a hash match.
Given these execution times and execution plans, can we say that inner join is better than cross apply? The answer is no. If the tables hold a large amount of data, the full table scans of the inner join cost more time and CPU (you can test this with large tables).
Although most queries written with cross apply can also be written with inner join, cross apply may produce a better execution plan and better performance, because it can restrict the set being joined before the join is executed.
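One common case where cross apply restricts the joined set is "first row per group": the right-hand side is cut down to a single row for each outer row before it is joined. The following is a minimal sketch under assumed sample data (the students Amy and Bob and their applications are hypothetical); it uses a correlated derived table with TOP in place of the article's Fn_apply function.

```sql
DECLARE @Student TABLE (Sid INT, Sname NVARCHAR(50));
DECLARE @Apply   TABLE (Sid INT, Cname NVARCHAR(50), Major NVARCHAR(50));

-- Hypothetical data: two applications for Amy, one for Bob.
INSERT INTO @Student (Sid, Sname) VALUES (123, N'Amy'), (456, N'Bob');
INSERT INTO @Apply (Sid, Cname, Major) VALUES
    (123, N'Stanford', N'CS'),
    (123, N'Berkeley', N'CS'),
    (456, N'MIT',      N'EE');

-- For each student, keep only the application whose college name
-- sorts first; TOP (1) limits the set before it is joined.
SELECT s.Sname, x.Cname, x.Major
FROM @Student AS s
CROSS APPLY (SELECT TOP (1) a.Cname, a.Major
             FROM @Apply AS a
             WHERE a.Sid = s.Sid
             ORDER BY a.Cname) AS x;
```

With this data the query returns one row per student: Berkeley for Amy and MIT for Bob. Writing the same query with a plain inner join would need an extra grouping or ranking step, which is why the apply form is often preferred here.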
Semi-join and anti-semi-join
A semi-join returns the rows of one table that have at least one matching row in the other table. The join is incomplete: as soon as a match is found for a row, the row is returned and the search for further matches stops.
An anti-semi-join performs the same incomplete matching between the rows of one table and the rows of another, but returns the rows that have no match.
Unlike the other join operations, semi-join and anti-semi-join have no explicit syntax, yet they occur in many scenarios in SQL Server. We can write a semi-join with an EXISTS subquery and an anti-semi-join with NOT EXISTS. Let's look at specific examples.
Suppose we need to find the students whose Sid appears in both the Apply and Student tables. These are the same students that the preceding inner join returns. The specific SQL code is as follows:
-- Student semi-join Apply based on Sid.
SELECT Student.Sname, Student.GPA, Student.Sizehs
    -- [Apply].Cname, [Apply].Major
FROM Student
WHERE EXISTS (SELECT * FROM [Apply] WHERE [Apply].Sid = Student.Sid)
The execution plan shows that the commonly used EXISTS clause is implemented through a left semi join. Semi-joins are therefore widely used in SQL Server.
Figure 9 query results
Figure 10 execution plan
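A semi-join and an inner join find the same students, but they do not return the same number of rows: the inner join repeats a student once per matching application, while EXISTS returns each student at most once. A small sketch with hypothetical sample data (Amy and Bob are made up for the example):

```sql
DECLARE @Student TABLE (Sid INT, Sname NVARCHAR(50));
DECLARE @Apply   TABLE (Sid INT, Cname NVARCHAR(50), Major NVARCHAR(50));

-- Hypothetical data: Amy has two applications, Bob has none.
INSERT INTO @Student (Sid, Sname) VALUES (123, N'Amy'), (456, N'Bob');
INSERT INTO @Apply (Sid, Cname, Major) VALUES
    (123, N'Stanford', N'CS'),
    (123, N'Berkeley', N'CS');

-- Inner join: Amy appears twice, once per application.
SELECT s.Sname
FROM @Student AS s
INNER JOIN @Apply AS a ON a.Sid = s.Sid;

-- Semi-join written with EXISTS: Amy appears once.
SELECT s.Sname
FROM @Student AS s
WHERE EXISTS (SELECT * FROM @Apply AS a WHERE a.Sid = s.Sid);
```

Bob is absent from both results because he has no row in @Apply.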
Now suppose we are asked to find the students who have not yet applied to any school. The NOT EXISTS clause comes to mind immediately. The specific SQL code is as follows:
-- Get the students who have not yet applied to any school.
SELECT Student.Sid, Student.Sname, Student.GPA, Student.Sizehs
    -- [Apply].Cname, [Apply].Major
FROM Student
WHERE NOT EXISTS (SELECT * FROM [Apply] WHERE [Apply].Sid = Student.Sid)
In fact, the NOT EXISTS clause is commonly implemented through an anti-semi-join. The execution plan shows that the query uses a left anti semi join to match on Sid.
Figure 11 query results
Figure 12 execution plan
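The same "students without applications" result can also be written as a left outer join that keeps only the unmatched rows. The sketch below puts both forms side by side on hypothetical sample data (Amy and Bob are made up); it compares only the results and makes no claim about which plan SQL Server chooses for either query.

```sql
DECLARE @Student TABLE (Sid INT, Sname NVARCHAR(50));
DECLARE @Apply   TABLE (Sid INT, Cname NVARCHAR(50), Major NVARCHAR(50));

-- Hypothetical data: Amy has applied, Bob has not.
INSERT INTO @Student (Sid, Sname) VALUES (123, N'Amy'), (456, N'Bob');
INSERT INTO @Apply (Sid, Cname, Major) VALUES (123, N'Stanford', N'CS');

-- Anti-semi-join written with NOT EXISTS.
SELECT s.Sid, s.Sname
FROM @Student AS s
WHERE NOT EXISTS (SELECT * FROM @Apply AS a WHERE a.Sid = s.Sid);

-- Equivalent result: left outer join, then keep the rows with no match.
SELECT s.Sid, s.Sname
FROM @Student AS s
LEFT OUTER JOIN @Apply AS a ON a.Sid = s.Sid
WHERE a.Sid IS NULL;
```

Both queries return only Bob.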
1.1.3 Summary
This article describes the application scenarios and features of join queries in SQL: inner join, outer join, cross join, and cross apply.