SQL query optimization: where conditions, sort fields, limit, and the mysteries of index usage

Source: Internet
Author: User

  Strange slow SQL

Let's take a look at two SQL statements.

First statement:

select * from acct_trans_log WHERE acct_id = 1000000000009000757 order by create_time desc limit 0,10

Second statement:

select * from acct_trans_log WHERE acct_id = 1000000000009003061 order by create_time desc limit 0,10

Table indexes and data volume: acct_id and create_time each have a single-column index (idx_acct_id and idx_create_time). The table holds about 5 million rows, and the result set filtered by acct_id is large, on the order of tens of thousands of rows.

Results: the first statement takes 5.018s, the second takes 0.016s.

Why is this? Both acct_id and create_time are indexed, so the query should not take as long as 5s. Let's look at the execution plans.

Execution plan of the first statement:

Execution plan of the second statement:

Looking closely, we find that only the idx_create_time index is used; idx_acct_id is not used at all.

The plan explains why the first statement is so slow: the where condition does not use an index. But why is the second statement so fast? It seems incredible until you understand how MySQL executes the query.

Both statements combine where, order by and limit. When a limit clause is present, the execution order may change. MySQL does not necessarily filter with where first, then sort, then apply limit; if it did, filtering 5 million rows through the where condition would not take 5s. Instead, the execution order here is: walk the idx_create_time index tree from the rightmost leaf node in reverse order, and check each record against the where condition. Each match yields one result row, and the scan stops as soon as 10 rows have been collected.

Why is the second statement faster? Luck: the first few records in reverse chronological order happen to satisfy the condition, so the scan ends almost immediately. For the first statement, the matching records lie much further back, so far more index entries have to be checked.

Now we know why the first is slow and the second is fast. The real problem is that MySQL does not use the idx_acct_id index, which means the index we created is effectively useless for this type of SQL, and query efficiency will be quite low. The reason is that the result set filtered by acct_id is large, with tens of thousands of rows. MySQL reasons that if it filters through idx_acct_id, the sort by time cannot use an index and becomes a filesort, which it expects to be slow. Since the two indexes cannot both be used, it chooses idx_create_time.

Why does MySQL use only one index? Why can't it use two? Many people may not know, but the principle is simple. Each index is a separate index tree in the database, and its leaf nodes store pointers to the actual rows. A query that uses an index retrieves pointers from the index tree and then fetches the rows. If one index yields the pointers that pass the filter and a second index is applied to another condition, the result is two sets of pointers. Taking their intersection may not be fast: if both sets are large, computing the intersection amounts to scanning both sets, which is inefficient. So in general two indexes are not used together. MySQL will sometimes combine two indexes on the fly (its index merge optimization), but not in every situation. The same principle applies to sorting: after the result set has been retrieved through one index, another index cannot be used to sort it.

In fact, using idx_acct_id is usually faster than using idx_create_time. For example:

select * from acct_trans_log force index (idx_acct_id) WHERE acct_id = 1000000000009000757 order by create_time desc limit 0,10

Time taken: 0.057s

We can see that idx_acct_id is relatively fast. So is that good enough? No: as long as the sort does not use an index, there is always a hidden risk.

  A composite index lets both the where field and the sort field use an index

Let's take a look at the next statement:

select * from acct_trans_log force index (idx_acct_id) WHERE acct_id = 3095 order by create_time desc limit 0,10

Time taken: 1.999s

Execution plan:

With idx_acct_id alone, MySQL must fetch every row for the account and sort them with filesort before limit can be applied, which gets slower as the account's result set grows. A composite index on acct_id and create_time, idx_acct_id_create_time, removes the sort.

The composite index lets both the where condition field and the sort field use an index. The problem is solved!
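The three access paths discussed so far can be compared with a small simulation. This is only a sketch under stated assumptions: sorted Python lists stand in for the index trees, the table contents and the account ids HOT and COLD are invented for illustration, and "examined" counts index entries touched rather than real MySQL cost.

```python
import bisect

# Hypothetical miniature of acct_trans_log. create_time is modelled as an
# integer that grows with insertion order, so a larger value means newer.
N = 100_000
HOT, COLD = 1, 2  # illustrative account ids, not taken from the article


def owner(t):
    if t >= N - 40 and t % 2 == 0:
        return HOT            # HOT owns 20 of the newest rows
    if t < 2_000 and t % 100 == 0:
        return COLD           # COLD owns 20 rows, all of them old
    return 100 + t % 97       # background accounts


rows = [(owner(t), t) for t in range(N)]          # (acct_id, create_time)

time_index = sorted((t, a) for a, t in rows)      # stand-in for idx_create_time
acct_index = sorted((a, p) for p, (a, _) in enumerate(rows))  # idx_acct_id -> row pointer
composite = sorted(rows)                          # idx_acct_id_create_time


def plan_time_index(acct_id, limit=10):
    """Walk idx_create_time from newest to oldest, filter, stop at `limit`."""
    found, examined = [], 0
    for t, a in reversed(time_index):
        examined += 1
        if a == acct_id:
            found.append(t)
            if len(found) == limit:
                break
    return found, examined


def plan_acct_index(acct_id, limit=10):
    """Range-scan idx_acct_id, fetch the rows, then sort them (the filesort)."""
    lo = bisect.bisect_left(acct_index, (acct_id, -1))
    hi = bisect.bisect_left(acct_index, (acct_id + 1, -1))
    times = [rows[p][1] for _, p in acct_index[lo:hi]]
    times.sort(reverse=True)
    return times[:limit], hi - lo


def plan_composite(acct_id, limit=10):
    """Walk the acct_id range of the composite index backwards; no sort."""
    i = bisect.bisect_left(composite, (acct_id + 1, -1)) - 1
    found = []
    while i >= 0 and composite[i][0] == acct_id and len(found) < limit:
        found.append(composite[i][1])
        i -= 1
    return found, len(found)


for name, acct in (("HOT", HOT), ("COLD", COLD)):
    for plan in (plan_time_index, plan_acct_index, plan_composite):
        result, examined = plan(acct)
        print(f"{name:4} {plan.__name__:16} examined={examined:6} newest={result[0]}")
```

On this invented data the reverse scan of the time index touches 20 entries for HOT but 99,000 for COLD, which mirrors the lucky and unlucky statements above. The acct_id plan touches all 20 rows of the account and then sorts them, while the composite plan touches only the 10 entries it returns.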

  Principles of the composite index

But why does this solve the problem? At this point we may simply memorize that a composite index solves where filtering plus sorting, without understanding the principle. That is a mistake: when the situation changes, we will be stuck. Let's look at another statement:

select * from acct_trans_log force index (idx_acct_id_create_time) WHERE acct_id in (3095, 1000000000009000757) order by create_time desc limit 0,10

Time taken: 1.391s

The index used is still idx_acct_id_create_time. The execution plan is:

 

Looking at the execution plan, filesort is used for sorting, that is, no index is used for sorting.
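One way to see why the composite index cannot supply the order here is to list the create_time values in the order the index stores them. The sketch below uses the two account ids from the statement; the time values are made up, and a sorted Python list is only a stand-in for the index tree.

```python
# Composite index entries (acct_id, create_time), kept in index order.
# The account ids come from the article; the time values are made up.
A, B = 3095, 1000000000009000757
composite = sorted([(A, 1), (A, 4), (A, 7), (B, 2), (B, 5), (B, 8)])


def times_in_index_order(acct_ids):
    """create_time values as a scan of the matching index ranges returns them."""
    return [t for acct, t in composite if acct in acct_ids]


def is_ascending(values):
    return all(x <= y for x, y in zip(values, values[1:]))


one_id = times_in_index_order({A})
two_ids = times_in_index_order({A, B})

print(one_id, is_ascending(one_id))     # [1, 4, 7] True
print(two_ids, is_ascending(two_ids))   # [1, 4, 7, 2, 5, 8] False

# With one id, reading the range backwards already is "order by create_time desc".
print(one_id[::-1] == sorted(one_id, reverse=True))     # True
# With two ids, an explicit sort (the filesort) is still needed.
print(two_ids[::-1] == sorted(two_ids, reverse=True))   # False
```

Within a single acct_id the times come out in order, but as soon as two ids are involved the sequence restarts at the second id, so the index order no longer matches the requested order.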

Let's look at how sorting with an index works. Start with this statement:

select * from acct_trans_log ORDER BY create_time limit 0,100

Time taken: 0.029s

The execution plan is:

Here MySQL simply reads the first 100 entries from the index tree. Because the index is stored in ascending order of create_time, traversing it from the leftmost leaf already yields the rows in ascending chronological order.
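The idea can be sketched in a few lines, assuming a sorted Python list as a stand-in for the create_time index tree; the data and function names are hypothetical.

```python
import random
from itertools import islice

# Hypothetical create_time values, stored in arbitrary table order.
table_times = list(range(10_000))
random.Random(0).shuffle(table_times)

# The index keeps the same values in ascending order. Building it is a
# cost paid when rows are written, not on each query.
index = sorted(table_times)


def order_by_asc_limit(n):
    """order by create_time limit n: read n entries from the left end."""
    return list(islice(index, n))


def order_by_desc_limit(n):
    """order by create_time desc limit n: read n entries from the right end."""
    return list(islice(reversed(index), n))


print(order_by_asc_limit(3))    # [0, 1, 2]
print(order_by_desc_limit(3))   # [9999, 9998, 9997]
print(order_by_asc_limit(100) == sorted(table_times)[:100])   # True
```

Neither function sorts anything at query time; each one touches only the n entries it returns, whichever end of the index it starts from.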

MySQL therefore does not sort anything here. For descending order, it traverses the index tree from the rightmost leaf and takes 100 entries, which is just as fast.

What about a composite index?

select * from acct_trans_log WHERE acct_id = 3095 order by create_time desc limit 0,10

This statement uses the composite index idx_acct_id_create_time. Because acct_id is the prefix of the composite index, the lookup on acct_id is fast. For the statement

select * from acct_trans_log WHERE acct_id = 3095

the matching index entries are stored in ascending order by default:

3095 + time1
3095 + time2
3095 + time3

In other words, that statement is equivalent to

select * from acct_trans_log WHERE acct_id = 3095 order by create_time

If we change the ordering to order by create_time desc limit 0,10, MySQL traverses the leaf nodes of the idx_acct_id_create_time tree in reverse order, starting from the right end of the 3095 range, and takes the first 10 entries. Because the prefix is fixed at 3095 and the suffix is in ascending chronological order, the reverse traversal returns exactly the order required by order by create_time desc, so no sorting is needed.

So why can't the following statement use the index for sorting?

select * from acct_trans_log force index (idx_acct_id_create_time) WHERE acct_id in (3095, 1000000000009000757) order by create_time desc limit 0,10

First consider how the index is ordered. Given id1 < id2 < id3 ... and time1 < time2 < time3 ..., the entries matched by the query are stored as follows:

id1 + time1
id1 + time2
id1 + time3
id2 + time1
id2 + time2
id2 + time3

The ids are in order but the times are not: with two ids, the entries are ordered by id first, and across the whole result set the times are jumbled. MySQL therefore has to use filesort. This is why the query is slow and why the index is not used for sorting.

  Query plan fields and how to read them
table: the table that this row of the plan refers to.
type: the access type. From best to worst: const, eq_ref, ref, range, index, ALL.
possible_keys: the indexes that could be applied to this table. If it is null, there is no usable index.
key: the index actually used. If it is null, no index is used.
key_len: the length of the index used. The shorter, the better.
ref: which column or constant is compared against the index. Ideally it is a constant.
rows: the number of rows MySQL estimates it must examine to return the requested data.

 
