Interesting questions about subqueries and connections

Source: Internet
Author: User
Today, a classmate made a joke. A: Random (selectidfromanalysiswherewordsomething); B: selectb. titlefromanalysisajoinsimplifiedbon (a. idb. id) wherea. wordsomething; C: selectsimplified

Today, a classmate made a joke. A: select title from simplified where id in (select id from analysis where word = something); B: select B. title from analysis a join simplified B on (. id = B. id) where. word = something; C: select simplified

Today, a classmate made a joke.

A:
Select title
From simplified
Where id in (select id
From analysis
Where word = 'something ');
B:
Select B. title
From analysis
Join simplified B
On (a. id = B. id)
Where a. word = 'something ';
C:
Select simplified. title
From analysis
Join simplified
On (analysis. id = simplified. id)
Where analysi. word = 'something ';

The preceding three SQL statements are executed in MYSQL. Which of the following statements is the most efficient?

When I saw this question, I remembered the question I asked when I was an interviewer. From another perspective, I have collected some thoughts about this question and summarized it into the following interview story.

This type of questions is really suitable for the interview to evaluate the knowledge of SQL.

Simulate a scenario where the interviewer is Tom and the interviewer is Tom. Next we will start the interview process .....

Xiao 'er:.... I have a small problem here. You can see (the above three SQL sections are provided), balabala ....

Tom: What about the data volume of the two tables? The optimization of join and subquery may vary depending on the data volume. A subquery is suitable for situations where the external result set is large and the subquery result set is small. It is best to ensure that the result set returned by the subquery is as small as possible.

Xiao'er: Assume that the data volume in two tables is almost the same, with around records.

Tom: If this is the case, A should be slower. If it is executed on mysql, the subquery Statement considers it more efficient to associate with the external simolified table. In this case, A is actually translated:

Select title from simplified where exists (select simplified. title from analysis where word = 'something' and id = simplified. id );

This in subquery method is inefficient when the external table (such as the simplified above) has a large amount of data.

Xiao'er: Well, what is the difference between B and C?

Tom: B and C should be similar in my opinion. The main difference is that B uses an alias. But I think it should have little impact.

Because the database itself will replace the table name with its own alias during execution. However, I remember that the optimization advice for a database is to try to connect as many tables as possible. It is best to use the table alias to reference columns.

By the way, you can index the id.

Xiao'er: 2. The table IDs are all primary keys. Is this an index?

Tom: Yes, the primary key can ensure that the record is unique and the primary key field is not empty. The database management system automatically generates a unique index for the primary key, so the primary key is also a special index.

Xiao'er: Well, a table can only have one primary key. Can it be said that there can only be one unique index?

Tom: No. The primary key must be a unique index. The unique index is not necessarily a primary key. The primary key can uniquely identify the attribute or attribute group of a row in the table. A table can only have one primary key, however, there can be multiple candidate indexes.

Xiao 'er:Is there a difference between a primary key column and a unique index column?

Tom: Yes. I know that the primary key column does not allow null values, while the unique index column does. A table can have multiple unique indexes, but only one primary key.

Xiao'er: Well, back to the above question, you remember to use the table alias to reference columns for multi-table join. Why did you think about it?

Tom:I guess it should be related to the internal processing logic of Mysql. It may be that if an alias is used, a temporary table will be created and put into the memory, so that the subsequent hits will be higher.

Xiao 'er:The mysql engine automatically generates the temp table, resulting in cache effect. From this SQL statement, you should be able to understand its intention. If you want to write better statements, try it...

Tom:Okay, I think .....

Tom:

Select B. title from
(Select id from analysis where word = 'something'),
Simplified B
Where a. id = B. id;

If so, it should be the fastest. Discard subqueries and joins.

Xiao'er: Well, good .....

......

If I encounter such a problem during the interview, it will definitely have a profound impact.

Original article address: I would like to thank the original author for sharing the interesting questions about subqueries and connections.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.