SQL join Summary

Source: Internet
Author: User
1.1.1 Summary

Join is one of the most important operations in a relational database system. Common join operations in SQL Server include inner join, outer join, and cross join. To retrieve data from two or more tables by matching rows in one table with rows in another, we use a join. A join can combine a table with another table or with the result of a table-valued function.

This article will introduce the features and usage of various common joins in SQL through examples:

Directory
    • Inner join
    • Outer Join
    • Cross join
    • Cross apply
    • Differences between cross apply and inner join
    • Semi-join and anti-semi-join
1.1.2 Main text

First, we define three tables in tempdb: College, Student, and Apply. The code is as follows:

    USE tempdb;

    ---- Drop the tables if tables with the same names already exist.
    IF EXISTS (SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
               WHERE TABLE_NAME = 'College')
        DROP TABLE College;

    IF EXISTS (SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
               WHERE TABLE_NAME = 'Student')
        DROP TABLE Student;

    IF EXISTS (SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
               WHERE TABLE_NAME = 'Apply')
        DROP TABLE [Apply];

    ---- Create the tables.
    CREATE TABLE College (Cname NVARCHAR(50), State TEXT, Enrollment INT);

    CREATE TABLE Student (Sid INT, Sname NVARCHAR(50), GPA REAL, Sizehs INT);

    ---- The column name Demo- contains a hyphen, so it must be written in brackets.
    CREATE TABLE [Apply] (Sid INT, Cname NVARCHAR(50), Major NVARCHAR(50), [Demo-] TEXT);
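The article does not list the statements that load the sample data. As a minimal sketch, the five College rows shown later in Table 1 could be inserted as follows. The multi-row VALUES form assumes SQL Server 2008 or later. The Student and Apply rows are not given in the article, so they are left out here.

```sql
---- Load the five College rows listed in Table 1.
INSERT INTO College (Cname, State, Enrollment)
VALUES ('Stanford', 'CA', 15000),
       ('Berkeley', 'CA', 36000),
       ('MIT',      'MA', 10000),
       ('Cornell',  'NY', 21000),
       ('Harvard',  'MA', 29000);
```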
Inner join

Inner join (INNER JOIN) is one of the most common join types. It returns only the rows that satisfy the join predicate.

Suppose we want to look up information about the schools that students have applied to. The Apply table contains only the school name, and we do not know in advance which schools appear in it, so we inner join the College and Apply tables on Cname to find the information for the schools contained in the Apply table.

The specific SQL code is as follows:

 
    ---- Gets college information from the College table
    ---- based on college name.
    SELECT DISTINCT College.Cname, College.Enrollment
    FROM College
    INNER JOIN [Apply] ON College.Cname = [Apply].Cname;

Figure 1 query results

Cname      State   Enrollment
Stanford   CA      15000
Berkeley   CA      36000
MIT        MA      10000
Cornell    NY      21000
Harvard    MA      29000

Table 1 data in the college table

As shown in Figure 1, the query returns the information for the schools contained in the Apply table. Because Harvard does not appear in the results, we know that no student has applied to Harvard.

Inner join (INNER JOIN) is commutative: "A inner join B" and "B inner join A" are equivalent.

Outer Join

Suppose we want to see information for all schools, including schools that nobody has applied to (such as Harvard). We can use an outer join (OUTER JOIN), because an outer join preserves all rows from one or both input tables, even when no row matching the join predicate is found.

The specific SQL code is as follows:

    ---- Gets all college information.
    SELECT College.Cname, College.State, College.Enrollment,
           [Apply].Cname, [Apply].Major, [Apply].[Demo-]
    FROM College
    LEFT OUTER JOIN [Apply] ON College.Cname = [Apply].Cname;

Figure 2 left join query results

As shown in Figure 2, no student in the Apply table has applied to Harvard, but the left join (LEFT OUTER JOIN) still returns information for every school.

The left join produces every row of the College table, filling in values from Apply where a match exists and NULL where none does. The NULL values in Harvard's row therefore tell us that no student has applied to Harvard.

With a left join we get the full set of College rows. If we want both the full set of College and the full set of Apply, we can use a full outer join (FULL OUTER JOIN). A full outer join returns all rows from both tables, whether or not they match the join predicate:

    ---- Gets all information from the College and Apply tables.
    SELECT College.Cname, College.State, College.Enrollment,
           [Apply].Cname, [Apply].Major, [Apply].[Demo-]
    FROM College
    FULL OUTER JOIN [Apply] ON College.Cname = [Apply].Cname;

Figure 3 full outer join query results

Now we have the complete data sets of both College and Apply. Where a match exists, the columns contain values; where no matching Cname is found, they are filled with NULL.

The following table shows which rows each outer join (OUTER JOIN) retains:

Join type               Rows retained
A left outer join B     All A rows
A right outer join B    All B rows
A full outer join B     All A and B rows

Table 2 rows retained by outer joins

Full outer join (FULL OUTER JOIN) is commutative: "A full outer join B" and "B full outer join A" are equivalent.
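Left and right outer joins are not commutative, but they mirror each other: swapping the two tables and switching LEFT to RIGHT preserves the same rows. As a sketch using the tables defined earlier, the following query should return the same rows as the left join of College with Apply shown above. The [Demo-] column is left out for brevity.

```sql
---- College is the preserved table, now on the right-hand side.
SELECT College.Cname, College.State, College.Enrollment,
       [Apply].Cname, [Apply].Major
FROM [Apply]
RIGHT OUTER JOIN College ON College.Cname = [Apply].Cname;
```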

Cross join

Cross join (CROSS JOIN) computes the Cartesian product of two tables: if table A has n rows and table B has m rows, the result has n * m rows. In other words, it pairs every row of one table with every row of the other. We cannot specify a predicate for a cross join with an ON clause. We can use a WHERE clause to filter the result instead, but in that case the cross join behaves essentially like an inner join.

Cross join is used less often than inner join, and it should not be applied to two large tables, because that leads to a very expensive operation and a very large result set.

The specific SQL code is as follows:

 
    ---- College cross join Apply.
    SELECT College.Cname, College.State, College.Enrollment,
           [Apply].Cname, [Apply].Major, [Apply].[Demo-]
    FROM College
    CROSS JOIN [Apply];
 

Figure 4 number of rows in the College and apply tables

Figure 5 cross join

The cross join of the College and Apply tables produces the Cartesian product of the two tables, so the number of result rows is the product of their row counts, that is, 5 * 20 = 100.
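The earlier remark that a WHERE clause makes a cross join behave like an inner join can be sketched with the same two tables. The two queries below are a minimal illustration and should return the same rows. The column list is shortened for readability.

```sql
---- Cross join filtered with WHERE.
SELECT College.Cname, College.Enrollment, [Apply].Major
FROM College
CROSS JOIN [Apply]
WHERE College.Cname = [Apply].Cname;

---- Equivalent inner join with the predicate in the ON clause.
SELECT College.Cname, College.Enrollment, [Apply].Major
FROM College
INNER JOIN [Apply] ON College.Cname = [Apply].Cname;
```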

Cross apply

Cross apply was introduced in SQL Server 2005 so that a table can be joined with the result of a table-valued function (TVF). For example, to combine each row of the Student table with the rows that a function returns for that student, we can use cross apply:

    ---- Creates a function to get data from Apply based on Sid.
    CREATE FUNCTION dbo.Fn_apply (@Sid INT)
    RETURNS @Apply TABLE (Cname NVARCHAR(50), Major NVARCHAR(50))
    AS
    BEGIN
        INSERT @Apply
        SELECT Cname, Major FROM [Apply] WHERE [Sid] = @Sid;
        RETURN;
    END
    GO

    ---- Student cross apply function Fn_apply.
    SELECT Student.Sname, Student.GPA, Student.Sizehs, Cname, Major
    FROM Student
    CROSS APPLY dbo.Fn_apply([Sid]);

We can also use an inner join to implement the same query as the cross apply. The specific SQL code is as follows:

 
    ---- Student inner join Apply based on Sid.
    SELECT Student.Sname, Student.GPA, Student.Sizehs, Cname, Major
    FROM Student
    INNER JOIN [Apply] ON Student.Sid = [Apply].Sid;

Figure 6 cross apply Query

Outer apply

After introducing cross apply and outer join, outer apply is not difficult to understand. Outer apply also joins a table with the result of a table-valued function (TVF), but it keeps every row of the table: where the function returns matching rows, their values appear; where it returns none, the function's columns are NULL.

 
    ---- Student outer apply function Fn_apply.
    SELECT Student.Sname, Student.GPA, Student.Sizehs, Cname, Major
    FROM Student
    OUTER APPLY dbo.Fn_apply([Sid]);
 
 

Figure 7 outer apply Query

Differences between inner join and cross apply

First, inner join is a join between tables, while cross apply is a join between a table and a table-valued function. As the previous cross apply example showed, the same query can also be implemented with inner join. To compare the two, we run both queries with statistics output enabled:

 
    ---- Student cross apply function Fn_apply.
    SET STATISTICS PROFILE ON;
    SET STATISTICS TIME ON;
    SELECT Student.Sname, Student.GPA, Student.Sizehs, Cname, Major
    FROM Student
    CROSS APPLY dbo.Fn_apply([Sid]);
    SET STATISTICS PROFILE OFF;
    SET STATISTICS TIME OFF;

    ---- Student inner join Apply based on Sid.
    SET STATISTICS PROFILE ON;
    SET STATISTICS TIME ON;
    SELECT Student.Sname, Student.GPA, Student.Sizehs, Cname, Major
    FROM Student
    INNER JOIN [Apply] ON Student.Sid = [Apply].Sid;
    SET STATISTICS PROFILE OFF;
    SET STATISTICS TIME OFF;

Cross apply query execution time:

CPU time = 0 ms, elapsed time = 11 ms.

Inner join query execution time:

CPU time = 0 ms, elapsed time = 4 ms.

Figure 8 execution plan

As shown in Figure 8, cross apply first executes the TVF (table-valued function), then performs a full scan of the Student table, and then looks for matching values by iterating over the Sid values.

Inner join performs a full table scan on both the Student and Apply tables and finds matching Sid values through a hash match.

Given these execution times and execution plans, can we say that inner join is better than cross apply? The answer is no. If the tables hold a large amount of data, the full table scans performed by the inner join cost more time and CPU resources (you can test this with tables that contain a large amount of data).

Although most queries implemented with cross apply can also be implemented with inner join, cross apply may produce a better execution plan and better performance, because it can restrict the set being joined before the join is executed.
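One way to picture "restricting the set before the join" is a cross apply over a correlated subquery instead of a function. The sketch below is an illustration, not a query from the article: for each student it keeps at most one application, chosen here as the first school name in alphabetical order, and the alias a is a name introduced for the example. Whether this beats an inner join depends on the data volume and the available indexes.

```sql
---- For each student, return at most one application
---- (the first school name in alphabetical order).
SELECT Student.Sname, Student.GPA, a.Cname, a.Major
FROM Student
CROSS APPLY (
    SELECT TOP (1) [Apply].Cname, [Apply].Major
    FROM [Apply]
    WHERE [Apply].Sid = Student.Sid
    ORDER BY [Apply].Cname
) AS a;
```

Students with no application do not appear in the result; replacing CROSS APPLY with OUTER APPLY would keep them with NULL in Cname and Major.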

Semi-join and anti-semi-join

A semi-join performs a partial join between the rows of one table and the rows of another: a row from the first table is returned as soon as one matching row is found in the second table, and the search for further matches stops.

An anti-semi-join works the same way, but it returns the rows of the first table that have no match in the second table.

Unlike the other join operations, semi-join and anti-semi-join have no explicit syntax, yet they have many application scenarios in SQL Server. We can use an EXISTS subquery to implement a semi-join and NOT EXISTS to implement an anti-semi-join. Let's look at specific examples.

Suppose we need to find the students in the Student table whose Sid also appears in the Apply table. This returns the same students as the preceding inner join query, but each student appears only once, however many applications they have submitted. The specific SQL code is as follows:

    ---- Student semi-join Apply based on Sid.
    SELECT Student.Sname, Student.GPA, Student.Sizehs
    ---- [Apply].Cname, [Apply].Major
    FROM Student
    WHERE EXISTS (SELECT * FROM [Apply] WHERE [Apply].Sid = Student.Sid);

The execution plan shows that the commonly used EXISTS clause is implemented through a left semi join. Semi-join is therefore widely used in SQL Server.

Figure 9 query results

Figure 10 execution plan

Now suppose we are asked to find the students who have not yet applied to any school. The NOT EXISTS clause comes to mind immediately. The specific SQL code is as follows:

    ---- Gets the students who have not yet applied to any school.
    SELECT Student.Sid, Student.Sname, Student.GPA, Student.Sizehs
    ---- [Apply].Cname, [Apply].Major
    FROM Student
    WHERE NOT EXISTS (SELECT * FROM [Apply] WHERE [Apply].Sid = Student.Sid);

In fact, the NOT EXISTS clause is commonly implemented through an anti-semi-join. The execution plan shows that, when matching on Sid, the query uses a left anti semi join.

Figure 11 query results

Figure 12 execution plan
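The same "students without an application" result can also be written with a left outer join that keeps only the unmatched rows. This is a sketch of an alternative formulation, not a query from the article. The plan that SQL Server chooses for it may differ from the left anti semi join used for NOT EXISTS.

```sql
---- Keep only the Student rows for which no Apply row matched.
SELECT Student.Sid, Student.Sname, Student.GPA, Student.Sizehs
FROM Student
LEFT OUTER JOIN [Apply] ON [Apply].Sid = Student.Sid
WHERE [Apply].Sid IS NULL;
```

Matched rows always carry a non-NULL [Apply].Sid, because the ON predicate compares it for equality, so the IS NULL filter selects exactly the unmatched students.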

1.1.3 Summary

This article describes the application scenarios and features of join queries in SQL: inner join, outer join, cross join, cross apply, outer apply, semi-join, and anti-semi-join.
