Analyze and compare in and join in multiple table queries

Source: Internet
Author: User

In is the keyword of the subquery, join is the keyword of the connection, the project development often uses to the multi-table query, but the subquery and the connection is realizes the multi-table query the important way. That's how the two work. In and join which is better. Here's how to analyze and compare.

Now there are two tables of test1 and test2, and there are no constraints like primary key, foreign key, and only one field. The two tables are non-related.

Now implement the subquery using the IN keyword, test2 as the subquery table (external table):

To view the execution plan:

Use the Join keyword to implement the connection, and the same test2 as the external table:

To view the execution plan:

Analysis: When using in subqueries to implement a multi-table query, from the execution plan can be seen, the entire query is divided into 3 parts, id = 1 of the query has two, id = 2 of the query has one. High level of ID, priority to query. The query for id = 2 corresponds to the FTS of test2 (subquery table). Then the query with ID = 1 is executed from the top down. The plan shows that this query is a subquery (subquery), and when querying test1, use the join buffer (Blocked Nested loop), which is the connection buffering (blocking nested loops). When using join joins to implement a multi-table query, query the Test2 table (external table), almost as in (FTS), then query the Test1 table, and in the same way, use the join buffer (Blocked Nested Loop) that join What does buffer (Blocked Nested Loop) mean, I think this blog has been explained very clearly. http://blog.itpub.net/22664653/viewspace-1692317/Summary, in a non-correlated (non-indexed) multi-table query, the query using in and join is the first to add the query result of the external table to the connection buffer. The data from the internal table is then taken into the buffer for comparison (nested loops). The query plan is almost indistinguishable. However, in the presence of a priority relationship, a subquery query is more than join, in which case the join is better.

Now add the primary key (index) to the Test1 table, add the FOREIGN KEY constraint (index) to the Test2 table, and the two tables are related.

Make the same query and return the same result:

To view the execution plan in the mode:

To view the execution plan for the Join method:

Analysis: When querying using in mode, the subquery is no longer displayed as subquery (if the subquery has a different priority), but rather a reference process. The external table test2 is scanned with an index first, and the test1 is scanned with an index, which references the Test2 ID column. Using the Join method is also the same as having a reference procedure. At this point, the two methods of query also do not use the above mentioned connection buffer and blocking nested loops. To summarize, when two tables are related (foreign keys are connected), both in and join, the Union lookup is a reference process.

Writing here, it seems that in and join in the table related (logical foreign key), do not know which is better, the following to practice.

Practical application:

The following is a sample database Sakila using MySQL (the Customer table has 599 customer information and the primary key is customer_id. There are 16044 rows of data in the rental table, where the primary key is rental_id, and the Foreign key column customer_id the primary key in the Reference Customer table, performing in and join implementations of multiple table queries, respectively:

In query statement: Select CONCAT (first_name,last_name) from the customer where customer_id in (SELECT customer_id from rental WHERE rental _id <=16000);

Results (599 Customer Name information returned):

Slow query log:

<code class= "Hljs sql Has-numbering" style= "display:block; padding:0px; Color:inherit; Box-sizing:border-box; font-family: "Source Code Pro", monospace;font-size:undefined; White-space:pre; border-radius:0px; Word-wrap:normal; background:transparent; " ># time:160717 21:17:58 # User@host:root[root] @ localhost [127.0.0.1] id:17 # query_time:0.000000 Lock_time:
0.000000 rows_sent:599 rows_examined:1198 use Sakila; <span class= "Hljs-operator" style= "Box-sizing:border-box;" ><span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >SET</span> <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >timestamp</span>=<span class= "Hljs-number" style= "Color:rgb (0, 102, 102); Box-sizing:border-box; " >1468761478</span>;</span> <span class= "Hljs-operator" style= "Box-sizing:border-box;" ><span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; ">SELECT</span> CONCAT (first_name,last_name) <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >FROM</span> customer <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >WHERE</span> customer_id <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >IN</span> (<span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); box-sizing:border-box;" >SELECT</span> customer_id <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >FROM</span> rental <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >WHERE</span> rental_id <=<span class= "Hljs-number" style= "Color:rgb (0, 102, 102); Box-sizing:border-box; " >16000</span>);</span> </code><ul class= "pre-numbering" style= "Box-sizing:border-box; Position:absolute; width:50px; top:0px; left:0px; Margin:0px; padding:6px 0px 40px; BORDER-RIGHT:1PX Solid RGB (221, 221, 221); List-style:none; Text-align:right; Background-color:rgb (238, 238, 238); " ><li style= "Box-sizing:border-box; padding:0px 5px; " >1</li><li style= "Box-sizing:border-box; padding:0px 5px; " >2</li><li style= "Box-sizing:border-box; padding:0px 5px; " >3</li><li style= "Box-sizing:border-box; padding:0px 5px; " >4</li><li style= "Box-sizing:border-box; padding:0px 5px; " >5</li><li style= "Box-sizing:border-box; padding:0px 5px; " >6</li><li style= "Box-sizing:border-box; padding:0px 5px; " >7</li></ul>

Join query statement: SELECT CONCAT (First_name,last_name) from the customer as a INNER JOIN rental as b on a.customer_id = b.customer_id WH ERE rental_id<=16000;

The result (returned 15995 rows of data, found that there are many duplicate names):

Slow query log:

<code class= "Hljs sql Has-numbering" style= "display:block; padding:0px; Color:inherit; Box-sizing:border-box; font-family: "Source Code Pro", monospace;font-size:undefined; White-space:pre; border-radius:0px; Word-wrap:normal; background:transparent; " ># time:160717 21:19:17 # User@host:root[root] @ localhost [127.0.0.1] id:18 # query_time:0.030000 Lock_time: 0.000000 rows_sent:15995 rows_examined:16643 <span class= "Hljs-operator" style= "Box-sizing:border-box;" ><span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >SET</span> <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >timestamp</span>=<span class= "Hljs-number" style= "Color:rgb (0, 102, 102); Box-sizing:border-box; " >1468761557</span>;</span> <span class= "Hljs-operator" style= "Box-sizing:border-box;" ><span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >selecT</span> CONCAT (first_name,last_name) <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >FROM</span> customer <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >AS</span> a <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >INNER</span> <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >JOIN</span> rental <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >AS</span> b <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >ON</span> a.customer_id = b.customer_id <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >WHERE</span> rental_id<=<span class= "Hljs-number" style= "Color:rgb (0, 102, 102); Box-sizing:border-box; " >16000</span>;</span> </code><ul class= "Pre-nUmbering "style=" Box-sizing:border-box; Position:absolute; width:50px; top:0px; left:0px; margin:0px; padding:6px 0px 40px; BORDER-RIGHT:1PX Solid RGB (221, 221, 221); List-style:none; Text-align:right; Background-color:rgb (238, 238, 238); " ><li style= "Box-sizing:border-box; padding:0px 5px; " >1</li><li style= "Box-sizing:border-box; padding:0px 5px; " >2</li><li style= "Box-sizing:border-box; padding:0px 5px; " >3</li><li style= "Box-sizing:border-box; padding:0px 5px; " >4</li><li style= "Box-sizing:border-box; padding:0px 5px; " >5</li><li style= "Box-sizing:border-box; padding:0px 5px; " >6</li></ul>

Use the DISTINCT keyword to remove the heavy JOIN query statement: SELECT DISTINCT CONCAT (First_name,last_name) from customer as a INNER JOIN rental as B on A.custo mer_id = b.customer_id WHERE rental_id <=16000;

Slow query log:

<code class= "Hljs sql Has-numbering" style= "display:block; padding:0px; Color:inherit; Box-sizing:border-box; font-family: "Source Code Pro", monospace;font-size:undefined; White-space:pre; border-radius:0px; Word-wrap:normal; background:transparent; " ># time:160717 21:20:31 # User@host:root[root] @ localhost [127.0.0.1] id:19 # query_time:0.010000 Lock_time: 0.000000 rows_sent:599 rows_examined:1797 <span class= "Hljs-operator" style= "Box-sizing:border-box;" ><span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >SET</span> <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >timestamp</span>=<span class= "Hljs-number" style= "Color:rgb (0, 102, 102); Box-sizing:border-box; " >1468761631</span>;</span> <span class= "Hljs-operator" style= "Box-sizing:border-box;" ><span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >select</span> <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >DISTINCT</span> CONCAT (first_name,last_name) <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >FROM</span> customer <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >AS</span> a <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >INNER</span> <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >JOIN</span> rental <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >AS</span> b <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >ON</span> a.customer_id = b.customer_id <span class= "Hljs-keyword" style= "Color:rgb (0, 0, 136); Box-sizing:border-box; " >WHERE</span> rental_id <=<span class= "Hljs-number" style= "Color:rgb(0, 102, 102); Box-sizing:border-box; " >16000</span>;</span> </code><ul class= "pre-numbering" style= "Box-sizing:border-box; Position:absolute; width:50px; top:0px; left:0px; margin:0px; padding:6px 0px 40px; BORDER-RIGHT:1PX Solid RGB (221, 221, 221); List-style:none; Text-align:right; Background-color:rgb (238, 238, 238); " ><li style= "Box-sizing:border-box; padding:0px 5px; " >1</li><li style= "Box-sizing:border-box; padding:0px 5px; " >2</li><li style= "Box-sizing:border-box; padding:0px 5px; " >3</li><li style= "Box-sizing:border-box; padding:0px 5px; " >4</li><li style= "Box-sizing:border-box; padding:0px 5px; " >5</li><li style= "Box-sizing:border-box; padding:0px 5px; " >6</li></ul>

Parse: Because the customer_id column of the rental table is the foreign key column, the primary key customer_id of the Customer table is referenced. So two tables on the query are related tables. The above has analyzed that in and join implementation of multi-table query there is no connection buffer and blocking nested loops. However, they are all searched by reference relationships. By comparing the lookup time (SQL efficiency) with the number of rows retrieved (disk IO), in which case I choose in to query.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.