International - English

Cart Console

Topic Center

Contact Sales

Home > Developer > Mobile Develop

INNER JOIN vs. Cross APPLY

Last Update:2014-08-31 Source: Internet

Author: User

Tags rowcount scalar

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Refer from:http://explainextended.com/2009/07/16/inner-join-vs-cross-apply/

INNER JOINUsed construct in SQL: It joins and together, selecting only those row combinations for which A JOIN condition is true.

This query:

SELECT  *from    table1join    table2on      table2.b = table1.a

Reads

For each row from table1 , select all rows from table2 where the value of field is equal to that b fielda

Note that this condition can is rewritten as this:

SELECT  *from    table1, table2where   table2.b = table1.a

, in which case it reads as following:

Make a set of all possible combinations of rows from and and of this table1 table2 set select only those rows where the Valu E of field is equal to that b fielda

These conditions is worded differently, but they yield the same result and database systems is aware of that. Usually both these queries is optimized to use the same execution plan.

The former syntax is called ANSI syntax, and it's generally considered more readable and are recommended to use.

However, it didn ' t get into the Oracle until recently, that's why there is many hardcore Oracle Developers That is just used to the latter syntax.

Actually, it ' s a matter of taste.

To use JOIN s (with whatever syntax), both sets is joining must be self-sufficient, I. E. The sets should not depend on all other. You can query both sets without ever knowing the contents on another set.

But for some tasks the sets is not self-sufficient. For instance, let ' s consider the following query:

We table table1 and table2 . Has table1 a column called rowcount .

For each row from table1 the We need to select first rowcount rows from table2 , ordered bytable2.id

We cannot formulate a join condition here. The join condition, should it exists, would involve the row number, which is isn't present table2 in, and there is no-to Calculate a row number is only from the values of columns of any given row in table2 .

That ' s where the CROSS APPLY can is used.

CROSS APPLYis a Microsoft's extension to SQL, which were originally intended to being used with table-valued functions (T VF' s).

The query above would look like this:

Select  *from    table1cross APPLY (select  TOP (table1.rowcount) *from    table2order Byid) T2

For each from table1 , select first table1.rowcount rows from table2 ordered byid

The sets here is not self-sufficient:the query uses values from table1 to define the second set, and not to with JOIN it.

The exact contents of t2 known until the corresponding row from is table1 selected.

I previously said that there is no-to join these-sets, which is true as long as we consider the sets as is. However, we can change the second set a little so, we get a additional computed field we can later join on.

The first option to do, just count all preceding rows in a subquery:

Select  *from    table1 t1join    (select  t2o.*, (select  COUNT (*) from    table2 t2iwhere   t2i.id <= t2o.id) as Rnfrom    table2 t2o) t2on      t2.rn <= t1.rowcount

The second option is to use a window function, also available in SQL Server since version 2005:

Select  *from    table1 t1join    (select  t2o.*, Row_number () over (ORDER by id) as Rnfrom    table2 t2o) T2on      t2.rn <= t1.rowcount

This functions returns the ordinal number a row would has being the ORDER BY condition used in the function applied to the Whol e query.

This was essentially the same result as the subquery used in the previous query.

Now, let's create the sample tables and check all these solutions for efficiency:

SET NOCOUNT ongodrop table [20090716_cross].table1drop table [20090716_cross].table2drop SCHEMA [20090716_cross] Gocreate SCHEMA [20090716_cross]create table table1 (ID INT NOT NULL PRIMARY Key,row_count int. NOT NULL) CREATE TABLE table2 (ID INT NOT NULL PRIMARY key,value VARCHAR (a) NOT null) Gobegin transactiondeclare @cnt Intset @cnt = 1WHILE @cnt <= 100000BEGININSERTINTO    [20090716_cross].table2 (ID, VA Lue) VALUES  (@cnt, ' Value ' + CAST (@cnt as VARCHAR)) SET @cnt = @cnt + 1ENDINSERTINTO    [20090716_cross].table1 (ID, R Ow_count) SELECT  TOP 5id, id% 2 + 1FROM    [20090716_cross].table2order Byidcommitgo

table2Contains 100,000 rows with sequential id s.

table1Contains the following:

ID	Row_count
1	2
2	1
3	2
4	1
5	2

Now let's run the first query (with COUNT ):

Select  *from    [20090716_cross].table1 t1join    (select  t2o.*, (select  COUNT (*) from    [ 20090716_cross].table2 t2iwhere   t2i.id <= t2o.id) as Rnfrom    [20090716_cross].table2 t2o] T2ON      t2.rn <= T1.row_countorder Byt1.id, t2.id

ID	Row_count	ID	value	RN
1	2	1	Value 1	1
1	2	2	Value 2	2
2	1	1	Value 1	1
3	2	1	Value 1	1
3	2	2	Value 2	2
4	1	1	Value 1	1
5	2	1	Value 1	1
5	2	2	Value 2	2
8 rows fetched in 0.0000s (498.4063s)

Table ' table1 '. Scan count 2, logical reads 200002, physical reads 0, Read-ahead reads 0, LOB logical reads 0, LOB physical reads 0, LOB R Ead-ahead reads 0. Table ' worktable '.  Scan count 100000, logical reads 8389920, physical reads 0, Read-ahead reads 0, LOB logical reads 0, LOB physical reads 0, LOB Read-ahead reads 0. Table ' worktable '. Scan count 0, logical reads 0, physical reads 0, Read-ahead reads 0, LOB logical reads 0, LOB physical reads 0, LOB Read-a Head reads 0. Table ' table2 '. Scan Count 4, logical reads 1077, physical reads 0, Read-ahead reads 0, LOB logical reads 0, LOB physical reads 0, LOB Rea D-ahead reads 0. SQL Server execution times:   CPU time = 947655 MS,

This query, as were expected, is very unoptimal. It runs for more than seconds.

Here's the query plan:

SELECT  Sort    Compute Scalar      Parallelism (Gather Streams)        Inner join (Nested Loops)          Inner Join ( Nested Loops)            Clustered Index Scan ([20090716_cross].[ Table2])            Compute Scalar              Stream Aggregate                Eager Spool                  Clustered Index Scan ([20090716_cross].[ Table2])          Clustered Index Scan ([20090716_cross].[ Table1])

For each row selected from table2 , it counts all previous rows again an again, never recording the intermediate result. The complexity of such an algorithm O(n^2) are, that's why it takes so long.

Let ' s run he second query, which uses ROW_NUMBER() :

Select  *from    [20090716_cross].table1 t1join    (select  t2o.*, Row_number () over (ORDER by id) as Rnfrom    [20090716_cross].table2 t2o] t2on      t2.rn <= t1.row_countorder byt1.id, t2.id

ID	Row_count	ID	value	RN
1	2	1	Value 1	1
1	2	2	Value 2	2
2	1	1	Value 1	1
3	2	1	Value 1	1
3	2	2	Value 2	2
4	1	1	Value 1	1
5	2	1	Value 1	1
5	2	2	Value 2	2
8 rows fetched in 0.0006s (0.5781s)

Table ' worktable '. Scan count 1, logical reads 214093, physical reads 0, Read-ahead reads 0, LOB logical reads 0, LOB physical reads 0, LOB R Ead-ahead reads 0. Table ' table2 '. Scan count 1, logical reads 522, physical reads 0, Read-ahead reads 0, LOB logical reads 0, LOB physical reads 0, LOB read -ahead reads 0. Table ' table1 '. Scan count 1, logical reads 2, physical reads 0, Read-ahead reads 0, LOB logical reads 0, LOB physical reads 0, LOB Read-a Head reads 0. SQL Server execution times:   CPU time = 578 MS,

This was much faster, only 0.5 ms.

Let's look into the query plan:

SELECT  Inner Join (Nested Loops)    Clustered Index Scan ([20090716_cross].[ Table1])  Lazy Spool    Sequence Project (Compute scalar)      Compute scalar        Segment          Clustered Index Scan ([20090716_cross]. [Table2])

This was much better, since this query plan keeps the intermediate results while calculating the ROW_NUMBER .

However, it still calculates s for all of ROW_NUMBER 100,000 rows table2 in, then puts them to a temporary index over rn creat Ed Lazy Spool by, and uses this index in a nested loop to range the rn s for each row from table1 .

Calculating and indexing all ROW_NUMBER s are quite expensive, that's why we see 214,093 Logical reads in the query stat Istics.

Finally, let ' s try a CROSS APPLY :

Select  *from    [20090716_cross].table1 t1cross APPLY (SELECT  TOP (t1.row_count) *from    [20090716_ Cross].table2order Byid) T2order byt1.id, t2.id

ID	Row_count	ID	value
1	2	1	Value 1
1	2	2	Value 2
2	1	1	Value 1
3	2	1	Value 1
3	2	2	Value 2
4	1	1	Value 1
5	2	1	Value 1
5	2	2	Value 2
8 rows fetched in 0.0004s (0.0008s)

Table ' table2 '. Scan Count 5, logical reads, physical reads 0, Read-ahead reads 0, LOB logical reads 0, LOB physical reads 0, LOB read- Ahead reads 0. Table ' table1 '. Scan count 1, logical reads 2, physical reads 0, Read-ahead reads 0, LOB logical reads 0, LOB physical reads 0, LOB Read-a Head reads 0. SQL Server execution times:   CPU time = 0 ms,

This query is instant, as it should be.

The plan is quite simple:

SELECT  Inner Join (Nested Loops)    Clustered Index Scan ([20090716_cross].[ Table1])    Top      Clustered Index Scan ([20090716_cross].[ Table2])

For the row from table1 , the It just takes first row_count rows from table2 . And so fast.

Summary:

While most queries which employ CROSS APPLY can is rewritten using an INNER JOIN , CROSS APPLY can yield better execution plan and better PE Rformance, since it can limit the set being joined yet before the join occurs.

INNER JOIN vs. Cross APPLY

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

Related Keywords:

mongodb inner join inner join php php mysql inner join postgresql inner join inner join mysql php inner join sql php db2 inner join

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

What's Trending

Top 10 Tags

datastax versions naming convention zookeeper client class definition md5 microsoft sql server 2005 data structures exception handling error handling

Top 10 Keywords

microsoft download center down wordpress address url site address url wordpress address url windows installer 4 0 download 302 not found web address url definition site address url wordpress db2 integer mac os installation step by step pdf abbreviation for return

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

INNER JOIN vs. Cross APPLY

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support