1.1.1 Summary
Join is one of the most important operations of relational database systems, including common joins in SQL Server: inner joins, outer joins, and Cross joins. If we want to get data in two or more tables that match rows from one table to rows in another, then we should consider using joins because the attributes of a join that joins a table or function for querying
This article will take a concrete example to introduce the characteristics and use of various common joins in SQL:
1.1.2 Body
First we define three tables college, student, and apply in tempdb, with the following SQL code:
Copy Code code as follows:
Use tempdb
----IF Database exists the same name DataTable deletes it.
IF EXISTS (SELECT table_name from INFORMATION_SCHEMA. TABLES WHERE table_name = ' College ') DROP TABLE College;
IF EXISTS (SELECT table_name from INFORMATION_SCHEMA. TABLES WHERE table_name = ' Student ') DROP TABLE Student;
IF EXISTS (SELECT table_name from INFORMATION_SCHEMA. TABLES WHERE table_name = ' Apply ') DROP TABLE Apply;
----Create Database.
CREATE table College (CName nvarchar (), state text, enrollment int);
CREATE TABLE Student (sID int, sname nvarchar (), GPA Real, Sizehs int);
CREATE TABLE Apply (sID int, cName nvarchar (), major nvarchar (), decision text);
Inner Join
An inner join (Inner join) is one of the most commonly used join types, and it queries the data that satisfies the join predicate.
Suppose we are looking for information about the application in applying for the school, and since the application table contains the school name we are not able to predict, we can find the information that contains the school in the Apply table based on the CNAME join (Inner join) Table College and apply.
The specific SQL code is as follows:
Copy Code code as follows:
----Gets College information from college table
----bases on college name.
SELECT DISTINCT College.cname, college.enrollment
From College INNER JOIN
Apply on college.cname = Apply.cname
Figure 1 Query Results
CName |
State |
Enrollment |
Stanford |
Ca |
15000 |
Berkeley |
Ca |
36000 |
MIT |
MA |
10000 |
Cornell |
NY |
21000 |
Harvard |
MA |
29000 |
Table 1 College The data in the table
As shown in Figure 1 above, we query the school information contained in the Apply table, as Harvard is not queried, so we know that there is no student application for Harvard.
The INNER join (Inner join) satisfies the commutative law: "A Inner join B" and "B Inner join A" are equal.
Outer Join
Suppose we want to see all the school information; even those schools that are not applying (e.g. Harvard), then we can use an outer join (Outer join) to query. Because an outer join holds all rows of one or two input tables, even rows that match the JOIN predicate cannot be found.
The specific SQL code is as follows:
Copy Code code as follows:
----Gets All college information
SELECT College.cname, College.state, College.enrollment,
Apply.cname, Apply.major, apply.decision
From College left OUTER JOIN
Figure 3 Left-Join query results
As shown in Figure 3 above: Because there are no students applying for Harvard in the Apply table, we query all the school information through the left outer join.
Because the left join produces the complete set of the table college, the match in the Apply table has a value, and the mismatch is replaced with a null value, so we know that there are no students applying for Harvard in the Apply table outer.
We can get the complete set of college by the left join query, assuming that now we have to get both the full set of college and the full set of apply, then we can consider using a full outer join (the fully outer join). With full outer joins, we can query all schools, regardless of whether they match join predicates:
Copy Code code as follows:
----Gets all information from college and apply table.
SELECT College.cname, College.state, College.enrollment,
Apply.cname, Apply.major, apply.decision
From College full OUTER JOIN
Apply on college.cname = Apply.cname
Figure 3 Full outer join query results
Now we have a full dataset of college and apply, and a value for the match in the table, even if no matching cname is found, replaced with a null value.
The following table shows the retention of data rows when each outer join (outer join) matches:
Join type |
Keep data rows |
A LEFT OUTER join B |
All A rows |
A right outer join B |
All B rows |
A full outer JOIN B |
All A and B rows |
Table 2 outer joins preserve data rows
The complete outer join (full outer join) satisfies the Exchange law: "A fully outer join B" is equal to "B full outer join A".
Cross Join
A cross join (cross join) performs a Cartesian product of two tables (that is, a n*m combination of table A and Table B data). That is, it matches one table with each row in another table; we cannot specify predicates by using the ON clause in a cross join, although we can use the WHERE clause to achieve the same result, which is that the cross join is essentially an inner join.
Cross joins are less used than internal joins, and two large tables should not cross join because it will result in a very expensive operation and a very large result set.
The specific SQL code is as follows:
Copy Code code as follows:
----College Cross join Apply.
SELECT College.cname, College.state, College.enrollment,
Apply.cname, Apply.major, apply.decision
From College
CROSS JOIN Apply
Figure 4 The number of rows college the table and the Apply table
Figure 5 Cross Join
We now cross-join the college and apply tables, and generate data behavior College and the Cartesian product of the Apply table row Number 5 * 20 = 100.
Cross Apply
Cross apply is provided in SQL Server 2005 to enable tables to join queries with table-valued functions (table-valued functions TVF ' s) results. For example, now we want to query by the result value of the function and the table student, at which point we can use cross apply to query:
Copy Code code as follows:
----Creates a function to get data from the Apply base on SID.
CREATE FUNCTION dbo.fn_apply (@sID int)
RETURNS @Apply TABLE (cName nvarchar), major nvarchar (50))
As
BEGIN
INSERT @Apply SELECT CName, Major from Apply where [sID] = @sID
Return
End
----Student CROSS apply function fn_apply.
SELECT Student.sname, Student.gpa, Student.sizehs,
CName, Major
From Student CROSS APPLY dbo.fn_apply ([SID])
We can also use the internal join implementation and cross apply the same query function, the specific SQL code is as follows:
Copy Code code as follows:
----Student INNER JOIN Apply bases on SID.
SELECT Student.sname, Student.gpa, Student.sizehs,
CName, Major
From Student INNER JOIN [Apply]
On student.sid = [Apply].sid
Figure 6 Cross Apply Query
Outer Apply
After introducing cross apply and Outer join, now let us understand out apply is also not difficult, Outer apply so that the table can and table-valued functions (table-valued functions TVF ' s) the results of the join query, If a match is found, there is a value, and a matching value is not found to indicate null.
Copy Code code as follows:
----Student outer apply function fn_apply.
SELECT Student.sname, Student.gpa, Student.sizehs,
CName, Major
From Student OUTER APPLY dbo.fn_apply ([SID])
Figure 7 Outer Apply Query
the difference between Inner join and cross apply
First we know that inner join is a join query for tables and tables, and cross apply is a join query for table and table-valued functions, and in the previous cross apply example, we can also implement the same query through inner join.
Copy Code code as follows:
----Student CROSS apply function fn_apply.
SET STATISTICS profile on
SET STATISTICS time on
SELECT Student.sname, Student.gpa, Student.sizehs,
CName, Major
From Student CROSS APPLY dbo.fn_apply ([SID])
SET STATISTICS profile off
SET STATISTICS Time off
----Student INNER JOIN Apply base on SID.
SET STATISTICS profile on
SET STATISTICS time on
SELECT Student.sname, Student.gpa, Student.sizehs,
CName, Major
From Student INNER JOIN [Apply]
On student.sid = [Apply].sid
SET STATISTICS profile off
SET STATISTICS Time Offcross apply
Query execution Time:
CPU time = 0 milliseconds, elapsed time = 11 milliseconds.
Inner Join query Execution time:
CPU time = 0 milliseconds, elapsed time = 4 milliseconds.
Figure 8 Implementation Plan
As shown in Figure 8: Cross Apply First execute TVF (table-valued functions), then perform a full table scan of the table studnet, then find the matching value by traversing the SID.
The Inner join scans the table student and apply for a full table, and then finds the matching SID value through a hash match.
Through the above SQL execution time and execution plan, can we say inner join is better than cross apply? The answer is no, if the table has a large amount of data, the full table scan of the inner join takes time and CPU resources to increase (it can be tested by a large data table).
Although most queries using the cross apply implementation can be implemented through a inner join, cross apply can produce better execution planning and performance, because it can restrict collection joins before the join is executed.
Semi-join and Anti-semi-join
Semi-join the rows returned from one table to the rows in another table that are not fully joined to the query (they are returned with a matching row of data, and no further lookups are found).
Anti-semi-join the rows returned from one table to the data row in the other table, and then returns the mismatched data.
Unlike other join operations, Semi-join and Anti-semi-join have no explicit syntax to implement, but Semi-join and anti-semi-join have multiple applications in SQL Server. We can use exists to implement Semi-join queries, not exists to implement Anti-semi-join. Now let's take a concrete example to illustrate it!
Suppose we are asked to find the student information that matches the SID in the Apply and student tables, and this is the same as the previous inner join query results, as follows:
Copy Code code as follows:
----Student Semi-join Apply base on SID.
SELECT Student.sname, Student.gpa, Student.sizehs
----[Apply].cname, [Apply].major
From Student
WHERE exists (
SELECT *
from [Apply]
where [APPLY].SID = Student.sid
)
We found that the commonly used exists clauses were implemented through the left Semi join, so there are many other uses in SQL Server that semi-join.
Figure 9 Query Results
Figure 10 Implementation Plan
Now we are asked to find out the information of the students who have not yet applied for the school, then we react immediately. You can use the NOT EXISTS clause to implement the query, with the following specific SQL code:
Copy Code code as follows:
----Gets student still not apply for school.
SELECT Student.sid, Student.sname, Student.gpa, Student.sizehs
----[Apply].cname, [Apply].major
From Student
WHERE not EXISTS (
SELECT *
from [Apply]
WHERE [APPLY].SID = Student.sid
)
In fact, the implementation of our usual NOT EXISTS clause is through anti-semi-join, and we find that SQL uses the left Anti semi join to query when looking for a matching SID.
Figure 11 Query Results
Figure 12 Implementation Plan
1.1.3 Summary
This article describes the common use of join queries in sql: Inner join, Outer join, Cross join, and Cross apply.