JOIN and APPLY

Source: Internet
Author: User
Tags joins

In a relational database, satisfying third normal form (3NF) requires moving columns that have a transitive dependency into a separate table and connecting the related tables with a JOIN clause. There are three kinds of join: outer join, inner join, and cross join. The outer joins are LEFT JOIN, RIGHT JOIN, and FULL JOIN; the inner join is INNER JOIN; the cross join is CROSS JOIN.

One, the composition of the JOIN clause

A JOIN clause consists of the joined tables, a join type, and an ON clause, as in the following pseudocode:

FROM left_table [INNER | LEFT | RIGHT | FULL] JOIN right_table ON condition

1, by position, the two tables in a join are the left table and the right table

    • Logically, a join first forms the Cartesian product of the left and right tables: every row in the left table is combined with every row in the right table, producing a virtual table (VT) whose row count is rows(VT) = rows(left_table) * rows(right_table);
    • The left and right tables of a join are evaluated in no particular order. This differs from the APPLY clause, which evaluates its left table before its right table;
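
The row-count rule above can be checked with a small sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for SQL Server, and the contents of ta and tb are made up for illustration.

```python
import sqlite3

# SQLite stands in for SQL Server; table contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ta (a INTEGER, b INTEGER);
    CREATE TABLE tb (ca INTEGER, cb TEXT);
    INSERT INTO ta VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO tb VALUES (1, 'x'), (1, 'y'), (4, 'z');
""")

rows_left = conn.execute("SELECT COUNT(*) FROM ta").fetchone()[0]
rows_right = conn.execute("SELECT COUNT(*) FROM tb").fetchone()[0]
# ON 1 = 1 keeps every combination, so the join returns the whole virtual table.
rows_vt = conn.execute("SELECT COUNT(*) FROM ta JOIN tb ON 1 = 1").fetchone()[0]

print(rows_left, rows_right, rows_vt)
```

With three rows on each side, the unfiltered join returns nine rows.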

2, join type

In an outer join, the LEFT, RIGHT, and FULL keywords identify the "preserved table" of the join: every row of the preserved table is returned, and the ON clause cannot filter those rows out.

3, the ON clause filters the virtual table

In an ON clause expression, the most common operator is equality (=); inequality operators (>, <>), LIKE, and other operators can also be used. The expression evaluates to a Boolean value;

The operands of an ON clause expression can be table columns, constants, or expressions, for example:

    • ON left_table.column = right_table.column
    • ON left_table.column = value
    • ON left_table.column + xx = value
    • No filtering: for example, ON 1 = 1

4, the ON clause determines the order of joins

If a query contains more than one JOIN clause, the placement of the ON clauses determines the order in which the joins are performed. In the following query, tb and tc are joined first, and ta is then joined to that result:

FROM ta LEFT JOIN tb LEFT JOIN tc ON tb.column = tc.column ON ta.column = tb.column

5, filtering in the ON clause versus the WHERE clause

The ON clause is evaluated before the WHERE clause. The ON clause cannot remove rows of the preserved table, but the WHERE clause can;

An inner join has no preserved table, so a filter gives the same result whether it is placed in the ON clause or in the WHERE clause. Even so, it is better to keep the two roles distinct: ON filters the joined virtual table, and WHERE filters the final result set.

For example, in the ON clause below, ta.column2 = value1 does not filter the left table ta: when the condition is not met, the right table's columns in that row are set to NULL, and the LEFT keyword ensures that all rows of the left table are returned. The WHERE clause (ta.column3 = value2) does filter the left table ta;

FROM ta LEFT JOIN tb ON ta.column1 = tb.column1 AND ta.column2 = value1 WHERE ta.column3 = value2
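
To make the difference concrete, here is a minimal sketch that puts the same constant condition first in ON and then in WHERE. SQLite (via Python's sqlite3) is assumed as a stand-in for SQL Server, and the tables ta(a, b) and tb(ca, cb) with their rows are hypothetical.

```python
import sqlite3

# SQLite stands in for SQL Server; table contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ta (a INTEGER, b INTEGER);
    CREATE TABLE tb (ca INTEGER, cb TEXT);
    INSERT INTO ta VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO tb VALUES (1, 'x'), (1, 'y'), (4, 'z');
""")

# Condition in ON: it only decides which rows match; ta is preserved in full.
on_rows = conn.execute("""
    SELECT a.a, b.ca
    FROM ta a LEFT JOIN tb b ON a.a = b.ca AND a.a = 1
    ORDER BY a.a
""").fetchall()

# Condition in WHERE: it filters the joined result, including rows from ta.
where_rows = conn.execute("""
    SELECT a.a, b.ca
    FROM ta a LEFT JOIN tb b ON a.a = b.ca
    WHERE a.a = 1
    ORDER BY a.a
""").fetchall()

print(on_rows)     # all three values of ta.a appear; unmatched rows carry None
print(where_rows)  # only rows with a.a = 1 remain
```

In the first query the rows with a = 2 and a = 3 survive with NULL right-hand columns; in the second they are removed.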

Two, create the test code
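
A minimal sketch of test data for the queries in the following sections. The column names a, b, and ca come from those queries; the column cb and all row values are assumptions made for illustration. SQLite (via Python's sqlite3) stands in for SQL Server, so the dbo schema prefix is dropped.

```python
import sqlite3

# Hypothetical test tables; SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ta (a INTEGER, b INTEGER);
    CREATE TABLE tb (ca INTEGER, cb TEXT);   -- cb is an invented extra column
    INSERT INTO ta VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO tb VALUES (1, 'x'), (1, 'y'), (4, 'z');
""")

ta_rows = conn.execute("SELECT a, b FROM ta ORDER BY a").fetchall()
tb_rows = conn.execute("SELECT ca, cb FROM tb ORDER BY ca, cb").fetchall()
print(ta_rows)
print(tb_rows)
```

The data is chosen so that ta.a = 1 has two matches in tb, ta.a = 2 and 3 have none, and tb.ca = 4 has no match in ta.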

Three, LEFT JOIN (left outer join)

1, LEFT JOIN algorithm

The left table is the preserved table: all of its rows are returned, and for left-table rows with no right-table row that satisfies the ON condition, the right table's columns are returned as NULL;

SELECT * FROM dbo.ta a LEFT JOIN dbo.tb b ON a.a = b.ca

2, using a constant in the ON clause to filter the left table

In a left outer join, the left table returns all of its rows. A condition such as "AND left_table.column = value" in the ON clause only restricts which rows count as matches: every left-table row is still returned, and the right table's columns are NULL for rows where the full condition is not met;

SELECT * FROM dbo.ta a LEFT JOIN dbo.tb b ON a.a = b.ca AND a.a = 1

3, using the WHERE clause to filter the left table

The WHERE clause is applied last and filters the joined result set, so only rows with a.a = 1 remain.

SELECT * FROM dbo.ta a LEFT JOIN dbo.tb b ON a.a = b.ca WHERE a.a = 1

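4, using the WHERE clause to filter the right table

If the WHERE clause filters on a right-table column with a condition that rejects NULL (such as b.ca = 1), the NULL-extended rows are removed as well, so the query can generally be rewritten as an inner join.

SELECT * FROM dbo.ta a LEFT JOIN dbo.tb b ON a.a = b.ca WHERE b.ca = 1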
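
The equivalence can be checked with a short sketch, again assuming SQLite (via Python's sqlite3) as a stand-in for SQL Server and hypothetical contents for ta(a, b) and tb(ca, cb).

```python
import sqlite3

# SQLite stands in for SQL Server; table contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ta (a INTEGER, b INTEGER);
    CREATE TABLE tb (ca INTEGER, cb TEXT);
    INSERT INTO ta VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO tb VALUES (1, 'x'), (1, 'y'), (4, 'z');
""")

# LEFT JOIN followed by a WHERE filter on a right-table column.
left_rows = conn.execute("""
    SELECT a.a, b.ca, b.cb
    FROM ta a LEFT JOIN tb b ON a.a = b.ca
    WHERE b.ca = 1
    ORDER BY b.cb
""").fetchall()

# The same query written as an INNER JOIN.
inner_rows = conn.execute("""
    SELECT a.a, b.ca, b.cb
    FROM ta a INNER JOIN tb b ON a.a = b.ca
    WHERE b.ca = 1
    ORDER BY b.cb
""").fetchall()

print(left_rows == inner_rows)
```

The rows that LEFT JOIN padded with NULL cannot satisfy b.ca = 1, so both queries return the same two rows.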

Four, RIGHT JOIN (right outer join)

RIGHT JOIN treats the right table as the preserved table: all of its rows are returned, and for right-table rows with no match in the left table, the left table's columns are set to NULL.

SELECT * FROM dbo.ta a RIGHT JOIN dbo.tb b ON a.a = b.ca

Five, INNER JOIN (inner join)

The algorithm is: an inner join has no preserved table. Only rows that satisfy the ON condition are returned; rows that do not satisfy it are discarded.

SELECT * FROM dbo.ta a INNER JOIN dbo.tb b ON a.a = b.ca

Six, FULL JOIN (full outer join)

The algorithm is: FULL JOIN treats both the left and right tables as preserved tables. Where rows of the left and right tables satisfy the ON condition, the combined row is returned; where a row has no match, the other table's columns are set to NULL.

SELECT * FROM dbo.ta a FULL JOIN dbo.tb b ON a.a = b.ca
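
The "both tables preserved" behavior can be sketched as follows. This is an emulation, not the FULL JOIN keyword itself: older SQLite builds lack FULL JOIN, so the sketch combines a LEFT JOIN with the unmatched right-table rows. SQLite (via Python's sqlite3) stands in for SQL Server, and the table contents are hypothetical.

```python
import sqlite3

# SQLite stands in for SQL Server; table contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ta (a INTEGER, b INTEGER);
    CREATE TABLE tb (ca INTEGER, cb TEXT);
    INSERT INTO ta VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO tb VALUES (1, 'x'), (1, 'y'), (4, 'z');
""")

# Emulated full outer join: every ta row (matched or not),
# plus the tb rows that have no partner in ta.
full_rows = conn.execute("""
    SELECT a.a, b.ca
    FROM ta a LEFT JOIN tb b ON a.a = b.ca
    UNION ALL
    SELECT NULL, b.ca
    FROM tb b
    WHERE NOT EXISTS (SELECT 1 FROM ta a WHERE a.a = b.ca)
""").fetchall()

for row in full_rows:
    print(row)
```

Unmatched rows from either side appear with None in the other table's column.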

Seven, CROSS JOIN (cross join)

The algorithm is: CROSS JOIN is the Cartesian product of the left table and the right table, and it has no ON clause. The Cartesian product combines every row of the left table with every row of the right table, and CROSS JOIN returns that product directly.

SELECT * FROM dbo.ta a CROSS JOIN dbo.tb b

Eight, self-join for cumulative sums

A self-join joins a table to itself. In the following statement, dbo.ta is inner joined to itself to compute the cumulative sum of column b.

SELECT t1.a, SUM(t2.b) AS b FROM dbo.ta t1 INNER JOIN dbo.ta AS t2 ON t1.a >= t2.a GROUP BY t1.a

In production environments, cumulative sums are often computed with self-joins. For example, given a table dbo.FinanceMonth that stores the quantity produced in each month, the query below computes, for each month, the cumulative quantity of all months in the year up to and including that month.

Calculate the cumulative value using a self-join

SELECT a.monthnum, SUM(b.quantity) AS totalquantity FROM dbo.FinanceMonth a INNER JOIN dbo.FinanceMonth b ON a.monthnum >= b.monthnum GROUP BY a.monthnum ORDER BY a.monthnum
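
The cumulative-sum self-join can be tried on a few rows. This is a sketch: SQLite (via Python's sqlite3) stands in for SQL Server, and the three rows of FinanceMonth are invented sample data.

```python
import sqlite3

# SQLite stands in for SQL Server; the monthly quantities are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE FinanceMonth (monthnum INTEGER, quantity INTEGER);
    INSERT INTO FinanceMonth VALUES (1, 100), (2, 150), (3, 50);
""")

# Each month a is paired with every month b at or before it,
# and the paired quantities are summed per month a.
running = conn.execute("""
    SELECT a.monthnum, SUM(b.quantity) AS totalquantity
    FROM FinanceMonth a
    INNER JOIN FinanceMonth b ON a.monthnum >= b.monthnum
    GROUP BY a.monthnum
    ORDER BY a.monthnum
""").fetchall()

print(running)  # [(1, 100), (2, 250), (3, 300)]
```

Month 2 pairs with months 1 and 2 (100 + 150), and month 3 pairs with all three months.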

Nine, APPLY usage

1, the difference between JOIN and APPLY

Logically, the left and right tables of a JOIN clause are evaluated in no particular order; when the right table is large, using the smaller table as the left table may reduce query time. The APPLY clause does distinguish the order of its inputs: it evaluates the left table first and then the right table, so APPLY is not a set-based operation. If the right input is a table-valued function, APPLY takes one row from the left table, passes its values as arguments to the function, and combines that left row with every row the function returns (the Cartesian product of the single row and the "right table"). With OUTER APPLY, if the right input returns no rows for a left row, the right-side columns are set to NULL.

SELECT * FROM dbo.ta a OUTER APPLY (SELECT * FROM dbo.tb b WHERE a.a = b.ca) p

The result is the same as with LEFT JOIN, but OUTER APPLY generally performs worse: T-SQL is good at set operations, set-based code usually performs well, and LEFT JOIN is a set-based operation.

Although APPLY tends to be slower, it has its place: it is the best choice when the right input must be computed from each left row in turn.

2, the two forms of APPLY

OUTER APPLY and CROSS APPLY have the following in common:

    • The left table is evaluated first, then the right table;
    • For each row in the left table, the right table is evaluated "row by row", similar to a correlated subquery. In practice, once the T-SQL optimizer has processed the APPLY, evaluation is not strictly row by row but N rows at a time;

OUTER APPLY and CROSS APPLY differ as follows:

    • OUTER APPLY: the left table is the preserved table. If the right table has no matching rows, the right table's columns are set to NULL, similar to LEFT JOIN.
    • CROSS APPLY: there is no preserved table. If a left-table row has no matching rows in the right table, that row does not appear in the final result set, similar to INNER JOIN.

SELECT * FROM dbo.ta a CROSS APPLY (SELECT * FROM dbo.tb b WHERE a.a = b.ca) p
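
The per-row semantics of the two forms can be sketched in plain Python. This is an illustration of the logical behavior only, not how SQL Server executes APPLY; the helper names cross_apply, outer_apply, and matching_tb, and the sample rows, are hypothetical.

```python
# Hypothetical sample rows: ta(a, b) and tb(ca, cb).
ta = [(1, 10), (2, 20), (3, 30)]
tb = [(1, "x"), (1, "y"), (4, "z")]

def matching_tb(left_row):
    """Right input: plays the role of (SELECT * FROM tb WHERE a.a = tb.ca)."""
    return [r for r in tb if r[0] == left_row[0]]

def cross_apply(left_rows, right_fn):
    """Keep a left row only if the right input returns rows for it."""
    out = []
    for row in left_rows:
        for r in right_fn(row):
            out.append(row + r)
    return out

def outer_apply(left_rows, right_fn, right_width):
    """Keep every left row; pad with None when the right input is empty."""
    out = []
    for row in left_rows:
        matches = right_fn(row)
        if matches:
            out.extend(row + r for r in matches)
        else:
            out.append(row + (None,) * right_width)
    return out

cross_rows = cross_apply(ta, matching_tb)
outer_rows = outer_apply(ta, matching_tb, right_width=2)
print(cross_rows)
print(outer_rows)
```

The left rows with a = 2 and a = 3 disappear under cross_apply and are padded with None under outer_apply.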

Ten, applications of JOIN

1, use CROSS JOIN to quickly generate a large sequence of numbers

The number of rows in the result of a cross join is the product of the row counts of the left and right tables. Because each table holds 10 numbers (0 to 9), cross joining 4 such tables quickly produces 10^4, or 10,000, sequential numbers.

;WITH num AS
(
    SELECT n FROM (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) AS p(n)
)
SELECT a.n + b.n * 10 + c.n * 100 + d.n * 1000 AS n
--INTO dbo.num
FROM num a
CROSS JOIN num b
CROSS JOIN num c
CROSS JOIN num d
ORDER BY n
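
A runnable sketch of the same idea, assuming SQLite (via Python's sqlite3) in place of SQL Server. The CTE is written in SQLite's syntax, with the column list on the CTE name, rather than in the T-SQL form above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Four copies of the digits 0-9 act as the ones, tens, hundreds,
# and thousands places of each generated number.
numbers = [row[0] for row in conn.execute("""
    WITH num(n) AS (
        VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)
    )
    SELECT a.n + b.n * 10 + c.n * 100 + d.n * 1000 AS n
    FROM num a
    CROSS JOIN num b
    CROSS JOIN num c
    CROSS JOIN num d
    ORDER BY n
""")]

print(len(numbers), numbers[0], numbers[-1])  # 10000 0 9999
```

Each combination of four digits yields a distinct number, so the result covers 0 through 9999 exactly once.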

2, use LEFT JOIN to find rows that do not exist in the right table

If a left-table row has no match in the right table, the right table's columns are NULL, so a filter in the WHERE clause can find the rows that exist in the left table but not in the right table.

SELECT * FROM dbo.ta t1 LEFT JOIN dbo.tb t2 ON t1.a = t2.ca WHERE t2.ca IS NULL;
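
A minimal sketch of this pattern, assuming SQLite (via Python's sqlite3) as a stand-in for SQL Server and hypothetical contents for ta(a, b) and tb(ca, cb).

```python
import sqlite3

# SQLite stands in for SQL Server; table contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ta (a INTEGER, b INTEGER);
    CREATE TABLE tb (ca INTEGER, cb TEXT);
    INSERT INTO ta VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO tb VALUES (1, 'x'), (1, 'y'), (4, 'z');
""")

# Rows of ta with no partner in tb come back with t2.ca set to NULL,
# so filtering on IS NULL keeps exactly those rows.
missing = conn.execute("""
    SELECT t1.a
    FROM ta t1 LEFT JOIN tb t2 ON t1.a = t2.ca
    WHERE t2.ca IS NULL
    ORDER BY t1.a
""").fetchall()

print(missing)  # [(2,), (3,)]
```

Only a = 1 has matches in tb, so the rows with a = 2 and a = 3 are reported as missing from the right table.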
