Execution plan for in (constant list) in MySQL

Source: Internet
Author: User

When we write SQL, we often use in,in followed by a list of constants, such as ID. Some say in is very efficient, and some say it is very low, some say in can use index, others say in cannot use index ...
What kind of a situation is it? We analyze the following situations
Before that, let's look at some of the type types of explain (this analysis refers to type types), by performance from high to Low:

Const: The maximum value of a record in a table can match this query (the index can be a primary key or a unique index). Because there is only one row, this value is actually a constant, because MySQL reads the value first and treats it as a constant.

Eq_ref: In a connection, when MySQL queries, from the previous table, the union of each record reads a record from the table, which is used when the query uses the index as the primary key or the unique key.

Ref: This connection type occurs only if the query uses a key that is not a unique or primary key or is part of these types (for example, using the leftmost prefix). For each row union of the previous table, all records are read from the table. This type is heavily dependent on how many records are matched against the index-the less the better

Range: This connection type uses an index to return rows in a range, such as what happens when you use > or < to find something

Index: This connection type is fully scanned for each record in the previous table (better than all because the index is generally less than the table data)

All: This connection type is fully scanned for each previous record, which is generally bad and should be avoided as much as possible.

One, in the back of only 1 values
1.1 For a primary key or a unique index, then Type=const, this performance is highest, indicating that only 1 records in the table can satisfy the query



1.2 for normal index, or federated primary key, Type=ref


1.3 for normal fields, Type=all, this performance is the worst



Second, in the back of the extra 1 values, but less than a certain value, this value is exactly how much, then will be revealed.
At this point, whether it is a primary key, or a unique index, or a normal index, type=range


But one of the exceptions to note here is that when your index's cardinality property is lower, type=all means that the index is very low-level, and it doesn't make much sense.
Then his execution plan type=all,mysql that the index is not as good as a full table scan:

What is the best performance when cardinality at what level? It is generally considered that the closer the value is to count (*), the better performance. For a field like sex, there's no need to index it.




Third, in contrast to the second case, when the value in the back is more than a certain value, it causes the whole table to be scanned. I'm not sure about this experience, I've consulted the DBA, and they can't give it their experience.
3.1 test1 Table A total of 27 data, when the value behind in less than 8, type=range, and when more than 8, Type=all, as follows:





3.2 This T_word_cost table, a total of 38,244 data, word_id on the index, I tested a pair, when in the back of not more than 5,000 or more than 6,000, type=range, what does this mean?
Because what puzzled me was that the value was not sure, it was volatile, I executed it several times, sometimes more than 5,000, sometimes more than 6,000, or other values

By testing the Test1 and T_word_cost tables, I did not find the law, I was delusional, through a lot of practice to get an empirical value, and then through the empirical value to determine the number of in the back of the count (*) Percentage of the time, can walk index, it seems I was in vain.


Since we cannot get the experience, we can only choose the solution in the actual application environment.
For example, this time I operate the T_word_cost table, in the face value of the number of more than count (*), if all the one-time write in the back, a query is about 30-40s!
According to the results of my analysis above, it seems that I should choose 5000 as the threshold value, then batch, multiple queries, so that performance should be the highest. But is that the truth?
Through my ongoing human testing in the program found that not 5000 time consuming the least, choose 2000 or 2500, the overall time is the least, as to why, may be related to memory, frequent database connections,
Because we know that memory, IO will affect the overall performance, so how to balance this degree need to grasp.

Execution plan for in (constant list) in MySQL

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.