Some summary of SQL join

Source: Internet
Author: User
Tags joins sql using

1.1.1 Summary

Join is one of the important operations of a relational database system, including common joins in SQL Server: INNER JOIN, outer join, and Cross join. If we want to get data in two or more tables that match rows from one table to rows in another table, then we should consider using join, because the join specifically joins the table or function to query the attributes

This article will introduce the characteristics and usage of various common joins in SQL using specific examples:

Directory
    • Inner Join
    • Outer Join
    • Cross Join
    • Cross Apply
    • The difference between cross apply and Inner join
    • Semi-join and Anti-semi-join
1.1.2 Body

First we define three tables college, student, and apply separately in tempdb, with the following SQL code:

Use tempdb----IF database exists the same name DataTable deletes it. IF EXISTS (SELECT table_name from INFORMATION_SCHEMA. TABLES WHERE table_name = ' College ') DROP TABLE college;if EXISTS (SELECT table_name from INFORMATION_SCHEMA. TABLES WHERE table_name = ' Student ') DROP TABLE student;if EXISTS (SELECT table_name from INFORMATION_SCHEMA. TABLES WHERE table_name = ' Apply ') DROP table apply;----Create database.create table College (cName nvarchar (), State te XT, enrollment int), CREATE TABLE Student (SID int, SName nvarchar (), GPA Real, Sizehs int); CREATE table Apply (SID int, CN Ame nvarchar (+), major nvarchar (+), decision text);
Inner Join

An inner join (Inner join) is one of the most commonly used join types, and it queries the data that satisfies the join predicate.

Suppose we want to check the application form apply for the school's information, because the Apply table contains the school name we can not predict, so we can be based on the CNAME join (Inner join) Table College and apply, This will find information about the school included in the Apply table.

The specific SQL code is as follows:

----Gets College information from college table----bases on college name. SELECT DISTINCT college.cname, college.enrollmentfrom  College INNER JOIN        Apply on college.cname = Apply.cname

Figure 1 Query Results

/tr> TD valign= "Top" width= "133" >ny
cname state enrollment
stanford ca 15000
berkeley ca 36000
mit ma 10000
cornell 21000
harvard ma 29000

Table 1 data in the College table

As shown in 1, we have queried the school information contained in the Apply table, as Harvard has not been queried, so we know that there are no students to apply for Harvard.

An inner join (Inner join) satisfies the commutative law: "A Inner join B" and "B Inner join a" are equal.

Outer Join

Let's say we want to see all the school information, even those that don't apply (e.g. Harvard), and we can use an outer join (Outer join) to query. Because an outer join holds all rows of one or two input tables, even if a row matching the join predicate cannot be found.

The specific SQL code is as follows:

----Gets All college informationselect College.cname, College.state, College.enrollment,apply.cname, Apply.major, Apply.decisionfrom  College left OUTER joins        Apply on college.cname = Apply.cname

Figure 2 LEFT JOIN query results

As shown in 3: Since no student has applied for Harvard in the Apply table, we have queried all school information through the left outer JOIN.

Since the left outer JOIN produces a full set of table college, and matches in the Apply table have values, and mismatches are substituted with null values, we know that there is no student application Harvard in the Apply table.

With a left join query we can get a complete set of college, assuming that now we both get the full set of college and get the full set of apply, then we can consider using a full outer join (The complete outer join). With a full outer join , we can query all schools, regardless of whether they match the join predicate:

----Gets all information from college and apply table. SELECT College.cname, College.state, College.enrollment,apply.cname, Apply.major, Apply.decisionfrom  College Full OUTER JOIN        

Figure 3 Full outer join query results

Now that we have the full data set for college and apply, there is a value for the match in the table, even if no matching cname is found, the null value is substituted.

The following table shows the cases where data rows are persisted when each outer join (outer join) matches:

Join type

Keep data rows

A LEFT OUTER join B

All A rows

A right outer join B

All B rows

A full outer JOIN B

All A and B rows

Table 2 outer joins preserve data rows

A complete outer join (full outer join) satisfies the commutative law: "A outer join B" and "B full outer join a" are equal.

Cross Join

A cross join performs a Cartesian product of two tables (which is a n*m combination of the data from table A and table B). That is, it matches each row in a table with another table; we cannot specify a predicate by using an ON clause in a cross join , although we can use the WHERE clause to achieve the same result, which is that the cross join is basically an inner join up.

Cross joins are less efficient than internal joins , and two large tables should not cross-join , as this will result in a very expensive operation and a very large result set.

The specific SQL code is as follows:

----College Cross Join Apply.select college.cname, College.state, College.enrollment,apply.cname, Apply.major, Apply.decisionfrom Collegecross JOIN Apply

Figure 4 Number of rows in the college table and the Apply table

Figure 5 Cross Join

Now we cross join the college and apply tables, and the Cartesian product that generates the data behavior College and the row number of the Apply table is 5 * 20 = 100.

Cross Apply

Cross apply is provided in SQL Server 2005 so that tables can join queries with table-valued functions (table-valued functions TVF ' s) results. For example, now we want to query with the result value of the function and the table student, when we can use cross apply to query:

----Creates a function to get data from the Apply base on Sid.create function dbo.fn_apply (@sID int) RETURNS @Apply TABLE (cNa Me nvarchar, major nvarchar () asbegin  INSERT @Apply SELECT cName, Major from Apply where [sID] = @sID  RETURN    END----Student cross apply function Fn_apply.select Student.sname, Student.gpa, Student.sizehs,cname, major from St Udent Cross APPLY dbo.fn_apply ([SID])

We can also use the internal join implementation and cross apply the same query function, the specific SQL code is as follows:

----Student INNER JOIN Apply bases on Sid.select student.sname, Student.gpa, Student.sizehs,cname, major from Student INN ER JOIN [Apply]on student.sid = [Apply].sid

Figure 6 Cross Apply query

Outer Apply

After introducing cross apply and Outer join, it is not difficult for us now to understand out apply, Outer apply enables the table to join query with table-valued function (table-valued functions TVF ' s) results, If a match is found, the value is not found, and the match value is null.

----Student outer apply function Fn_apply.select student.sname, Student.gpa, Student.sizehs,cname, major from Student out ER APPLY dbo.fn_apply ([SID])

Figure 7 Outer Apply Query

The difference between Inner join and cross apply

First we know that the inner join is a join query for tables and tables, and cross apply is a join query for table and table-valued functions, and in the previous crosstab application we can implement the same query through the inner join.

----Student CROSS apply function Fn_apply.set STATISTICS profile ONSET STATISTICS time ONSELECT student.sname, STUDENT.GP A, Student.sizehs,cname, Major from Student cross APPLY dbo.fn_apply ([SID]) SET STATISTICS profile OFFSET STATISTICS time O Ff
----Student INNER JOIN Apply base on sid.set STATISTICS profile ONSET STATISTICS time ONSELECT student.sname, Student.gpa , Student.sizehs,cname, Major from Student INNER JOIN [Apply]on student.sid = [apply].sidset STATISTICS profile OFFSET STA Tistics Time OFF

Cross apply query execution time:

CPU time = 0 milliseconds, elapsed time = 11 milliseconds.

Inner Join query Execution time:

CPU time = 0 milliseconds, elapsed time = 4 milliseconds.

Figure 8 Execution plan

8: Cross Apply first executes TVF (table-valued functions), then performs a full table scan of the table studnet, and then finds the matching values by traversing the SID.

The Inner join makes a full table scan of the table student and apply, and then finds the matching SID value through hash matching.

With the above SQL execution time and execution plan, can we say that inner join is better than cross apply? The answer is no, if the table has a large amount of data, the full table scan of the inner join is time-consuming and CPU-intensive (it can be tested by a table with large data volumes).

Although most queries implemented with cross apply can be implemented through the inner join, a better execution plan and improved performance can be achieved with the result of a pass, because it allows the collection to be restricted before the join executes.

Semi-join and Anti-semi-join

Semi-join the rows returned from one table are not fully joined to the data rows in another table (the lookup to a matching row of data returns and no longer finds).

Anti-semi-join the rows returned from one table are not fully joined to the data rows in the other table, and then return unmatched data.

Unlike other join operations, Semi-join and Anti-semi-join have no explicit syntax to implement, but Semi-join and Anti-semi-join have many applications in SQL Server. We can use exists to implement Semi-join query, not exists to implement Anti-semi-join. Now let's take a concrete example to illustrate it!

Suppose we were asked to find the student information for the SID match in the Apply and student tables, which would be the same as the results of the previous inner join query, with the following SQL code:

----Student Semi-join apply base on Sid.select student.sname, Student.gpa, student.sizehs----[Apply].cname, [Apply]. Major from Studentwhere exists (    SELECT * from    [Apply]    where [apply].sid = Student.sid)

We found that the common EXISTS clause was originally implemented through the left Semi join, so Semi-join is a lot of use in SQL Server.

Figure 9 Query Results

Figure 10 Execution plan

Now we are asked to find out the student information that has not yet been applied to the school, when we immediately react we can use the NOT EXISTS clause to implement the query, the SQL code is as follows:

----Gets student still not apply for school. SELECT Student.sid, Student.sname, Student.gpa, student.sizehs----[Apply].cname, [Apply].major from Studentwhere not EXISTS (    SELECT * from    [Apply]    WHERE [apply].sid = Student.sid)

In fact, our common implementation of the NOT EXISTS clause is through Anti-semi-join, where we find that when a matching SID is found, SQL uses the left Anti semi join to query.

Figure 11 Query Results

Figure 12 Execution plan

1.1.3 Summary

This article describes the uses and features of join queries that are commonly used in sql: Inner join, Outer join, cross join, and cross apply.

Series Blog Navigation

    • Some summary of deadlock
    • Some summary of SQL transcation

Reference

    • Http://coolshell.cn/articles/3463.html
    • Http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
    • Http://blogs.msdn.com/b/craigfr/archive/2006/07/19/671712.aspx

SQL code

    • Http://files.cnblogs.com/rush/Sample.rar
    • [Source]: http://www.cnblogs.com/rush/

Some summary of SQL join

Related Article

E-Commerce Solutions

Leverage the same tools powering the Alibaba Ecosystem

Learn more >

Apsara Conference 2019

The Rise of Data Intelligence, September 25th - 27th, Hangzhou, China

Learn more >

Alibaba Cloud Free Trial

Learn and experience the power of Alibaba Cloud with a free trial worth $300-1200 USD

Learn more >

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.