Subquery operations in MySQL in detail

Last Update:2017-01-19 Source: Internet

Author: User

First, do the following preparatory work.

Create a new test database, testdb:

  CREATE DATABASE testdb;

Create the test tables table1 and table2:

   -- The VARCHAR lengths were lost in the source; 10 is an assumed value.
   CREATE TABLE table1
   (
     customer_id VARCHAR(10) NOT NULL,
     city VARCHAR(10) NOT NULL,
     PRIMARY KEY (customer_id)
   ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

   CREATE TABLE table2
   (
     order_id INT NOT NULL AUTO_INCREMENT,
     customer_id VARCHAR(10),
     PRIMARY KEY (order_id)
   ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Insert the test data:

   INSERT INTO table1 (customer_id, city) VALUES ('163', 'Hangzhou');
   INSERT INTO table1 (customer_id, city) VALUES ('9you', 'Shanghai');
   INSERT INTO table1 (customer_id, city) VALUES ('TX', 'Hangzhou');
   INSERT INTO table1 (customer_id, city) VALUES ('Baidu', 'Hangzhou');

   INSERT INTO table2 (customer_id) VALUES ('163');
   INSERT INTO table2 (customer_id) VALUES ('163');
   INSERT INTO table2 (customer_id) VALUES ('9you');
   INSERT INTO table2 (customer_id) VALUES ('9you');
   INSERT INTO table2 (customer_id) VALUES ('9you');
   INSERT INTO table2 (customer_id) VALUES ('TX');

When the preparations are done, table1 and table2 should look like this:

   mysql> SELECT * FROM table1;
   +-------------+----------+
   | customer_id | city     |
   +-------------+----------+
   | 163         | Hangzhou |
   | 9you        | Shanghai |
   | Baidu       | Hangzhou |
   | TX          | Hangzhou |
   +-------------+----------+
   4 rows in set (0.00 sec)

   mysql> SELECT * FROM table2;
   +----------+-------------+
   | order_id | customer_id |
   +----------+-------------+
   |        1 | 163         |
   |        2 | 163         |
   |        3 | 9you        |
   |        4 | 9you        |
   |        5 | 9you        |
   |        6 | TX          |
   +----------+-------------+
   6 rows in set (0.00 sec)

With the preparation done, let's start today's summary.
A problem

Now we need to query the order numbers of all users in Hangzhou. How should this SQL statement be written? First, you can write it like this:

SELECT table2.customer_id, table2.order_id FROM table2 JOIN table1 ON table1.customer_id = table2.customer_id WHERE table1.city = 'Hangzhou';

This gives the results we need. However, we can also write it like this:

SELECT customer_id, order_id FROM table2 WHERE customer_id IN (SELECT customer_id FROM table1 WHERE city = 'Hangzhou');

What is the SELECT statement inside the parentheses? What is this syntax, and how does it accomplish the task? This post revolves around those questions.
What is a subquery?

Simply put, a subquery is a query nested inside another statement, like the SELECT in parentheses in the second statement above. The nested query is called the inner query; the query that contains it is called the outer query. A subquery can contain any clause that an ordinary SELECT can, such as DISTINCT, GROUP BY, ORDER BY, LIMIT, JOIN, and UNION, but the corresponding outer statement must be one of the following: SELECT, INSERT, UPDATE, DELETE, SET, or DO.

We can use subqueries in the WHERE and HAVING clauses, so that the result of a subquery becomes part of a condition.
Subqueries using comparison operators

Depending on whether it returns a scalar (a single value), a row, a column, or a table, a subquery is called a scalar, row, column, or table subquery.

When a subquery returns a scalar, we can use a comparison operator in the WHERE or HAVING clause to test its result directly. For example, suppose I want the customer_id, city, and number of orders of every user who has more orders than user TX. How should this SQL statement be written?

First, here are my general steps for writing SQL:

1. Read and understand the requirement.
   Get the customer_id, city, and number of orders of the users who have more orders than user TX.
2. Work out which fields are ultimately needed.
   The final result needs customer_id, city, and the number of orders.
3. Work out which tables those fields come from.
   table1 and table2 are involved.
4. Work out how those tables are related.
   table1 and table2 are related through the customer_id field.
5. Break the requirement down into smaller requirements.
   Get the number of orders of user TX;
   get the number of orders of each of the other users;
   compare the order counts.
6. Determine the filter conditions for each small requirement.
7. Get the result of each small requirement and assemble them into the final result.
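
As a rough sketch of the decomposition above, each small requirement can be written and checked as its own query before the final statement is assembled. The queries assume the table1 and table2 test tables from the preparation step and that the customer ID is stored as 'TX'; the alias order_count is introduced here only for illustration.

```sql
-- Small requirement 1: the number of orders placed by user TX (a single value).
SELECT COUNT(order_id)
FROM table2
WHERE customer_id = 'TX';

-- Small requirement 2: the number of orders placed by each of the other users.
SELECT table1.customer_id, city, COUNT(order_id) AS order_count
FROM table1
JOIN table2 ON table1.customer_id = table2.customer_id
WHERE table1.customer_id <> 'TX'
GROUP BY table1.customer_id, city;
```

The remaining requirement, comparing the counts, is what joins the two pieces: the first query becomes a subquery on the right-hand side of a comparison in the HAVING clause of the second.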

Finally, here is the SQL statement:

SELECT table1.customer_id, city, COUNT(order_id) FROM
table1 JOIN table2 ON
table1.customer_id = table2.customer_id
WHERE table1.customer_id <> 'TX'
GROUP BY table1.customer_id HAVING
COUNT(order_id) >
            (SELECT COUNT(order_id) FROM
             table2
             WHERE customer_id = 'TX'
             GROUP BY customer_id);

The query above uses a subquery, and the outer query compares its own result with the result of the subquery. If the subquery returns a scalar value (a single value), the outer query can compare against it with the =, >, <, >=, <=, and <> operators. If the subquery returns more than a scalar value and the outer query still uses a comparison operator on its result, an error is raised.
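
A minimal sketch of both cases, assuming the test tables above with city values stored without padding, so that exactly one customer is in 'Shanghai' and several are in 'Hangzhou':

```sql
-- Works: the subquery returns exactly one value, so "=" can compare against it.
SELECT order_id
FROM table2
WHERE customer_id = (SELECT customer_id FROM table1 WHERE city = 'Shanghai');

-- Fails when the subquery returns several rows, as it does for 'Hangzhou':
-- ERROR 1242 (21000): Subquery returns more than 1 row
SELECT order_id
FROM table2
WHERE customer_id = (SELECT customer_id FROM table1 WHERE city = 'Hangzhou');
```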
Subqueries using ANY

The comparison operators above require the subquery to return a scalar value. But what if the subquery returns a set of values?

No problem: we can use ANY, IN, SOME, or ALL to test the result of the subquery. This section covers ANY.

The ANY keyword must be used together with one of the comparison operators listed above. It means "return true if the comparison is true for any one of the values in the column returned by the subquery."

Take "10 > ANY (11, 20, 2, 30)": because 10 > 2, the test returns true. In general it returns true as long as comparing 10 with at least one value in the set gives true.
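
The comparison above can be tried directly as a rough sketch. MySQL expects a subquery on the right-hand side of ANY, so the literal set is written here as a UNION ALL of single-value SELECTs; the column alias any_result is only illustrative.

```sql
-- 10 > ANY (11, 20, 2, 30): true (MySQL shows 1), because 10 > 2.
SELECT 10 > ANY (SELECT 11 UNION ALL SELECT 20
                 UNION ALL SELECT 2 UNION ALL SELECT 30) AS any_result;

-- 10 > ANY (11, 20, 30): false (MySQL shows 0), because 10 exceeds none of them.
SELECT 10 > ANY (SELECT 11 UNION ALL SELECT 20
                 UNION ALL SELECT 30) AS any_result;
```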

For example, suppose I now want the ID, city, and number of orders of the users who have more orders than either TX or 9you.

The following SQL statement meets the requirement:

SELECT table1.customer_id, city, COUNT(order_id) FROM
table1 JOIN table2 ON
table1.customer_id = table2.customer_id
WHERE table1.customer_id <> 'TX' AND table1.customer_id <> '9you'
GROUP BY table1.customer_id HAVING
COUNT(order_id) > ANY
(
SELECT COUNT(order_id) FROM
table2
WHERE customer_id = 'TX' OR customer_id = '9you'
GROUP BY customer_id);

The meaning of ANY is easy to grasp: taken literally it means "any one", so the test returns true as long as the condition holds for any one value.
Subqueries using IN

Subqueries with IN come up all the time in everyday SQL. IN tests whether the specified value is in the set: it returns true if it is, and false otherwise.

IN is an alias of "= ANY": wherever "= ANY" is used, "IN" can be substituted. It is worth trying this substitution yourself.
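
One way to try the substitution, as a sketch against the test tables above: the two statements differ only in the operator and should return the same rows.

```sql
-- Orders of Hangzhou customers, written with IN.
SELECT customer_id, order_id
FROM table2
WHERE customer_id IN (SELECT customer_id FROM table1 WHERE city = 'Hangzhou');

-- The same query written with = ANY.
SELECT customer_id, order_id
FROM table2
WHERE customer_id = ANY (SELECT customer_id FROM table1 WHERE city = 'Hangzhou');
```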

Where there is IN, there is also NOT IN. Note that NOT IN does not mean the same thing as <> ANY; it means the same as <> ALL. ALL is summarized further below.
Subqueries using SOME

SOME is an alias of ANY and is used less often. Once you understand ANY there is nothing more to add, so please refer to the ANY section above.
Subqueries using ALL

ALL must be used together with a comparison operator. It means "return true if the comparison is true for all of the values in the column returned by the subquery."

Take "10 > ALL (2, 4, 5, 1)": because 10 is greater than every value in the set, the test returns true. For "10 > ALL (20, 3, 2, 1, 4)" the test returns false, because 10 is less than 20.

The synonym of <> ALL is NOT IN, meaning "not equal to any value in the set". This is easy to confuse with <> ANY, so take extra care.
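
To make the difference concrete, here is a small sketch against the test tables, assuming table2.customer_id contains no NULL values (which holds for the test data above):

```sql
-- Customers that never appear in table2, written two equivalent ways.
SELECT customer_id FROM table1
WHERE customer_id NOT IN (SELECT customer_id FROM table2);

SELECT customer_id FROM table1
WHERE customer_id <> ALL (SELECT customer_id FROM table2);

-- Not equivalent: <> ANY is true as soon as one value in the set differs,
-- so once table2 holds more than one distinct customer_id it matches
-- every row of table1.
SELECT customer_id FROM table1
WHERE customer_id <> ANY (SELECT customer_id FROM table2);
```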
Scalar subqueries

Subqueries can be divided into scalar subqueries and multivalued subqueries according to the number of values they return. When a subquery is used with a comparison operator, it must be a scalar subquery; if a multivalued subquery is used instead, an error is raised.
Multivalued subqueries

A multivalued subquery is the counterpart of a scalar subquery: it returns a column, a row, or a table, which makes up a set of values. We generally use ANY, IN, ALL, and SOME to compare the outer query against the result of such a subquery. These keywords can also be applied to a scalar subquery, in which case the set simply contains a single value.
Standalone subqueries

A standalone subquery is a subquery that can run without relying on the outer query. What does relying on the outer query mean? Look at the following two SQL statements first.

SQL statement 1: get the order numbers of all Hangzhou customers.

SELECT order_id FROM
table2
WHERE customer_id IN
          (SELECT customer_id FROM
          table1
          WHERE city = 'Hangzhou');

SQL statement 2: get the users whose city is Hangzhou and who have at least one order.

SELECT * FROM
table1
WHERE city = 'Hangzhou' AND EXISTS
                (SELECT * FROM
                table2
                WHERE table1.customer_id = table2.customer_id);

The two statements above are not ideal examples, but they are enough to illustrate the point.

For SQL statement 1, we can copy the subquery out and execute it on its own; the subquery has nothing to do with the outer query.

For SQL statement 2, we cannot execute the subquery on its own, because it references a column of the outer query. The subquery therefore depends on the outer query.
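
The difference can be checked by running each inner query alone, as in this sketch against the test tables:

```sql
-- Inner query of SQL statement 1: runs on its own.
SELECT customer_id FROM table1 WHERE city = 'Hangzhou';

-- Inner query of SQL statement 2: fails on its own, because table1 is
-- only visible through the outer query.
-- ERROR 1054 (42S22): Unknown column 'table1.customer_id' in 'where clause'
SELECT * FROM table2 WHERE table1.customer_id = table2.customer_id;
```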

With subqueries, efficiency is often a concern. When we execute a SELECT statement, we can put the EXPLAIN keyword in front of it to see the query type, the indexes used by the query, and other information. For example:

EXPLAIN SELECT order_id FROM
  table2
  WHERE customer_id IN
            (SELECT customer_id FROM
            table1
            WHERE city = 'Hangzhou');

With a standalone subquery, if the subquery needs at most n traversals of its set and the outer query at most m, the total cost can be written as O(m+n). With a correlated subquery, the number of traversals may reach O(m+m*n). Efficiency can drop sharply, so when you use subqueries, be sure to consider whether they depend on the outer query.

For a fuller explanation of EXPLAIN, refer to the MySQL reference manual.
Correlated subqueries

A correlated subquery is a subquery that references a column of the outer query; that is, the subquery is evaluated once for each row of the outer query. Internally, however, MySQL may optimize it dynamically in different ways depending on the situation. Correlated subqueries are where performance problems are most likely to appear. Optimizing SQL statements is a very big topic, and only experience accumulated in practice leads to a better understanding of how to do it.

I cannot say much about SQL performance here. Thinking about performance only from other people's articles does not really give you a feel for it; it takes real projects to understand it well.
The EXISTS predicate

EXISTS is a very powerful predicate that lets the database check efficiently whether a given query produces any rows. It returns TRUE or FALSE depending on whether the subquery returns a row. Unlike other predicates and logical expressions, EXISTS never returns UNKNOWN, whether or not the subquery returns rows: if the subquery's filter evaluates to UNKNOWN for a row, that row is simply not returned, so for EXISTS an UNKNOWN counts as FALSE. Here again is the statement from above, which gets the users whose city is Hangzhou and who have at least one order.

SELECT * FROM
table1
WHERE city = 'Hangzhou' AND EXISTS
                (SELECT * FROM
                table2
                WHERE table1.customer_id = table2.customer_id);

With EXPLAIN in front of this statement, the original output listed the inner query with the select type DEPENDENT SUBQUERY, that is, a correlated subquery. EXISTS and IN look very similar, so what is the difference between them?

The main difference between IN and EXISTS lies in three-valued logic. EXISTS always returns TRUE or FALSE, whereas IN can return UNKNOWN in addition to TRUE and FALSE when NULL values are involved. In a filter, however, UNKNOWN is handled the same way as FALSE, so the SQL optimizer can choose the same execution plan for IN as for EXISTS.
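
As an illustration, SQL statement 2 can be sketched with IN instead of EXISTS. In a WHERE filter both forms should return the same rows; whether they also get the same execution plan depends on the MySQL version, which can be checked with EXPLAIN.

```sql
-- Hangzhou users that have at least one order, written with IN.
SELECT *
FROM table1
WHERE city = 'Hangzhou'
  AND customer_id IN (SELECT customer_id FROM table2);
```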

IN and EXISTS are almost the same, but NOT IN and NOT EXISTS are not: they differ greatly when the input list contains NULL values. When the input list contains a NULL, IN returns either TRUE or UNKNOWN, never FALSE, so NOT IN yields NOT TRUE or NOT UNKNOWN, that is, FALSE or UNKNOWN.

mysql> SELECT 'c' NOT IN ('a', 'b', NULL)\G

Execute the statement above and look at the result. It may surprise you.
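
A short sketch of the behavior and of two ways to stay clear of it. The literal comparisons do not depend on any table; the last two queries assume the test tables above, where table2.customer_id is nullable.

```sql
SELECT 'c' IN ('a', 'b', NULL);      -- NULL (UNKNOWN): 'c' might equal the NULL
SELECT 'c' NOT IN ('a', 'b', NULL);  -- NULL (UNKNOWN), not 1
SELECT 'a' NOT IN ('a', 'b', NULL);  -- 0: 'a' is definitely in the list

-- Customers without orders, written with NOT EXISTS: the predicate is
-- never UNKNOWN, so a NULL customer_id in table2 does not hide any rows.
SELECT customer_id
FROM table1
WHERE NOT EXISTS (SELECT * FROM table2
                  WHERE table2.customer_id = table1.customer_id);

-- The same requirement with NOT IN needs the NULLs filtered out explicitly.
SELECT customer_id
FROM table1
WHERE customer_id NOT IN (SELECT customer_id FROM table2
                          WHERE customer_id IS NOT NULL);
```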
Derived tables

As mentioned above, a subquery can also return a table. If the virtual table returned by a subquery is used as input to a FROM clause, it becomes a derived table. The syntax is as follows:

FROM (subquery expression) AS derived_table_alias

A derived table is a purely virtual table: it is not stored as a permanent physical table, although MySQL may hold it in an internal temporary table while the query runs.
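
A minimal sketch of a derived table against the test tables: the subquery in the FROM clause computes one order count per customer, and the outer query joins that result to table1. The alias order_counts and the column name order_count are hypothetical names chosen for this example.

```sql
SELECT table1.customer_id, table1.city, order_counts.order_count
FROM table1
JOIN (SELECT customer_id, COUNT(order_id) AS order_count
      FROM table2
      GROUP BY customer_id) AS order_counts
  ON order_counts.customer_id = table1.customer_id;
```

MySQL requires every derived table to have an alias, which is why the AS clause cannot be left out.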
Summary

That about wraps up the summary. Of course, there is much more to subqueries than a single article can cover; this one only summarizes a few basic concepts and common points. Using subqueries in UPDATE, DELETE, and INSERT statements is not covered here, but it works in largely the same way. Knowledge has no end once you start unfolding it, so you need to know when to stop: dig to an appropriate depth, ideally no more than two levels, and how to define those two levels is up to you. That is the end of this article; see you in the next one.
