Analysis of the Index Condition Pushdown (ICP) and Multi-Range Read (MRR) query optimizations in MySQL

Source: http://www.cnblogs.com/wy123/p/7374078.html
(The source link is kept not to claim any rights over an original work, since my writing is far from that level. It is only there to point back to the original article, because possible errors will be corrected or supplemented there later. Nothing more.)

 

 

ICP Optimization Principle

Index Condition Pushdown (ICP), also known as index push-down, shows up in the execution plan as Using index condition in the Extra column (the Extra column can carry a lot of information, so only a simple analysis is given here).
In general, the principle of ICP is that filter conditions which cannot be used directly for the index lookup are evaluated by the storage engine layer while it reads the index, instead of being evaluated in the server layer after the storage engine has returned the rows.
In other words, while fetching data, ICP can apply WHERE conditions that cannot drive the index lookup but whose columns are in the index. This avoids the two-step process needed without ICP, where rows are fetched without applying the secondary filter condition.
Without ICP, the first step is to fetch the rows and the second step is to filter the fetched rows by the remaining conditions.
Compared with the latter, the former performs the index seek and the filter in a single pass, which is more efficient.

Applicable scenarios:
The ICP optimization policy can be used for data access methods of the range, ref, eq_ref, and ref_or_null types.

 

Without concrete cases it is hard to get a feel for this optimization strategy, so let's look at two examples.

 

ICP optimization instance

The first example is common online and easy to understand. For the table structure, see the previous post (http://www.cnblogs.com/wy123/p/7366486.html)

The index used on the test_orderdetail table is: create index idx_orderid_productname on test_orderdetail (order_id, product_name);
The query is: select * from test_orderdetail where order_id = 10900 and product_name like '%00163e0496af%';
Clearly, order_id = 10900 can be used directly for the index lookup. Although product_name is part of the composite index, product_name like '%00163e0496af%' starts with a wildcard, so it cannot be used for the lookup.
The execution plan shows Using index condition in the Extra column.

The principle of ICP here is as follows.
While the index is searched with the first condition order_id = 10900, the condition product_name like '%00163e0496af%', which cannot drive the lookup itself, is evaluated against the index entries at the same time.
The filtered result is therefore produced in a single step.

Compare this with ICP disabled.
If ICP is disabled, the Extra column of the execution plan shows Using where.
This means that the index lookup with order_id = 10900 runs first, and the resulting rows are then filtered by product_name like '%00163e0496af%' in the server layer.
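
The difference between the two cases can be sketched with a toy model in Python. This is only an illustration under simplifying assumptions, not how MySQL is implemented: the secondary index is a list of (order_id, product_name, primary key) tuples, the table is a dict keyed by primary key, and the data and the function names scan_without_icp and scan_with_icp are made up for this example. The point is only to count how many full rows have to be fetched from the table.

```python
# Toy model of Index Condition Pushdown; all names and data are illustrative.

# Secondary index on (order_id, product_name); each entry also stores the primary key.
INDEX = [
    (10900, "apple-00163e0496af", 1),
    (10900, "banana", 2),
    (10900, "cherry", 3),
    (10900, "pear-00163e0496af-x", 4),
    (10901, "apple-00163e0496af", 5),
]

# Clustered index: primary key -> full row.
TABLE = {pk: {"id": pk, "order_id": oid, "product_name": name, "qty": 1}
         for oid, name, pk in INDEX}


def scan_without_icp(order_id, needle):
    """The engine fetches every row matching order_id; the server filters afterwards."""
    fetched = [TABLE[pk] for oid, _name, pk in INDEX if oid == order_id]
    result = [row for row in fetched if needle in row["product_name"]]
    return result, len(fetched)


def scan_with_icp(order_id, needle):
    """The engine tests the pushed-down condition on the index entry before fetching the row."""
    fetched = [TABLE[pk] for oid, name, pk in INDEX
               if oid == order_id and needle in name]
    return fetched, len(fetched)


rows_a, fetches_a = scan_without_icp(10900, "00163e0496af")
rows_b, fetches_b = scan_with_icp(10900, "00163e0496af")
print(fetches_a, fetches_b)  # number of full-row fetches without and with pushdown
```

Both functions return the same rows. Only the number of full-row fetches differs, which is the saving that ICP is after.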

  

The second example is one I came up with later, to verify the scenarios where ICP applies and whether it really performs better than running without ICP.

The table used this time is test_order, and the index on test_order is: create index idx_userid_order_id_createdate on test_order (user_id, order_id, create_date);
The query is: select * from test_order where user_id = 500 and create_date > '2017-1-1';
As in the preceding example, the second filter condition cannot use the index directly, because order_id, the column between user_id and create_date in the index, is not constrained.

First, look at the execution plan with ICP enabled.

Then the execution plan with ICP disabled.

Next, turn ICP on and off and compare the profile information of each run.

Comparing the execution details of the two statements, with ICP enabled and with it disabled, the stage/sql/Sending data stage differs by an order of magnitude.
This shows that ICP reduces the number of data exchanges between the server layer and the engine layer.

  

To quote one sentence from "MySQL · Feature Analysis · Index Condition Pushdown (ICP)":
When the secondary index is a composite index and the filtering condition has low selectivity, enabling ICP can effectively reduce the number of interactions between the server layer and the engine layer, and thus effectively reduce the running time.

 

Finally, consider one more question.
Take the query select * from test_orderdetail where order_id = 10900 and product_name like '%00163e0496af%';
If order_id is in a secondary index but product_name is not in that index, will MySQL use ICP?
The answer is no.
ICP applies when both conditions are covered by the index but the second condition cannot drive the index lookup. If the column of the second condition is not in the index, its value is not known while the index is being read, so there is nothing to push down.
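
The reasoning in both examples and in this last question can be put into a small Python sketch. It is a deliberately simplified teaching model, not the optimizer's real logic: the function name classify and the labels "seek", "pushdown" and "server" are made up here, each column is assumed to appear in at most one condition, and "other" stands for any predicate that cannot drive an index seek, such as LIKE with a leading wildcard.

```python
# Simplified classification of WHERE conditions against one composite index.
# This is a teaching model, not the optimizer's real logic.

def classify(index_columns, conditions):
    """conditions: list of (column, kind) with kind in {"eq", "range", "other"}.

    Returns a dict mapping each condition column to "seek", "pushdown" or "server".
    """
    by_column = dict(conditions)
    roles = {}
    seekable = True  # becomes False once the usable index prefix ends
    for col in index_columns:
        kind = by_column.get(col)
        if kind is None:
            seekable = False  # gap in the prefix: later columns cannot extend the seek
            continue
        if seekable and kind == "eq":
            roles[col] = "seek"
        elif seekable and kind == "range":
            roles[col] = "seek"
            seekable = False  # nothing after a range column can extend the seek
        else:
            roles[col] = "pushdown"  # column is in the index, so the engine can test it there
            seekable = False
    for col, _kind in conditions:
        roles.setdefault(col, "server")  # column not in the index: the full row is needed
    return roles


# First example: composite index (order_id, product_name).
print(classify(["order_id", "product_name"],
               [("order_id", "eq"), ("product_name", "other")]))
# Second example: order_id is skipped, so create_date cannot drive the seek.
print(classify(["user_id", "order_id", "create_date"],
               [("user_id", "eq"), ("create_date", "range")]))
# Last question: product_name is not in the index at all.
print(classify(["order_id"],
               [("order_id", "eq"), ("product_name", "other")]))
```

In the first two calls the second condition is classified as pushed down. In the third it can only be checked in the server layer, which matches the answer above.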

  

 

  

 

Multi-Range Read (MRR)

The problem without MRR:
First, some background: all InnoDB tables in MySQL are clustered index tables. If no primary key is explicitly defined, InnoDB chooses or generates a clustered index automatically.
When a range query uses a secondary (non-clustered) index, the rows are looked up through the clustered index key stored in the leaf nodes of the secondary index B-tree (the back-to-table operation).
However, the rows that satisfy the condition (the rows found through the secondary index) may be scattered anywhere in the clustered index B-tree, so going back to the table may cause a lot of random IO.
When the table is very large, each row lookup may need a random disk access, which can hurt performance.

For example, following the blue line in the figure, the target rows found through the secondary index sit at physical locations 1, 2, 3 and 4. (The key question is in what order the data at these four locations is fetched: it can be fetched in random order or in sorted order.)
Suppose we fetch the data at these four locations in the order given by the secondary index.
Entries are sorted by the secondary index key, so the corresponding clustered index keys may be in random order, and the physical locations of the rows are random as well.
Going back to the table row by row in secondary index order therefore requires random IO.
The drawback of this access pattern is that when the four rows are far apart physically, the disk head has to move back and forth (random IO reads).

The goal of the MRR (Multi-Range Read) optimization is to sort the row read requests first and then read the rows with sequential IO, avoiding random IO.
Sorted by what? As I understand it, by the clustered index key values found in the secondary index range scan.
The sequential scan can be thought of as three steps:

(1) Find the clustered index key values of the target rows through the secondary (non-clustered) index.
(2) Sort the clustered index key values found through the secondary index. Clustered index order corresponds to physical storage order.
(3) Go back to the table in the sorted clustered index order, reading the rows with an ordered disk scan, which speeds up data reading.
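
Steps (1) to (3) can be sketched in Python as a toy model. The data, the function names and the total_jump cost measure are all assumptions made for this illustration. In particular, the distance between consecutive primary keys is only a crude stand-in for disk seek cost, and the sketch sorts everything at once, whereas a real server works with bounded buffers.

```python
# Toy model of Multi-Range Read; the data and function names are illustrative.

# Result of a secondary index range scan: (secondary_key, primary_key) in secondary-key order.
INDEX_RANGE = [("a", 40), ("b", 3), ("c", 27), ("d", 8), ("e", 35), ("f", 1)]


def lookup_order_without_mrr(entries):
    """Rows are fetched in secondary-key order, so the primary keys arrive unsorted."""
    return [pk for _key, pk in entries]


def lookup_order_with_mrr(entries):
    """Steps (1)-(3): collect the primary keys, sort them, then fetch in that order."""
    return sorted(pk for _key, pk in entries)


def total_jump(pks):
    """Sum of distances between consecutive primary keys, a crude stand-in for seek cost."""
    return sum(abs(b - a) for a, b in zip(pks, pks[1:]))


plain = lookup_order_without_mrr(INDEX_RANGE)
mrr = lookup_order_with_mrr(INDEX_RANGE)
print(total_jump(plain), total_jump(mrr))  # prints 141 39
```

With the same six rows, the sorted lookup order covers a much shorter total distance, which is the effect MRR is aiming for on real storage.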

Sequential disk reads are usually faster, but that does not mean this approach is always more efficient. It has both advantages and drawbacks, and there are exceptions.

1. If only a small data range is scanned and the target data is already cached in memory, the only effect of MRR is the extra CPU overhead of buffering and sorting.
2. ORDER BY ... LIMIT n queries may become slower when n is small.
The reason is that MRR reads rows in disk order, so the rows read first are not necessarily the first n rows in ORDER BY order.
3. MRR is an execution strategy. MySQL does not know in advance how many target rows there are,
so in the extreme case of a single row it still has to sort first and then go back to the table, which is not worth the cost.
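
Point 2 can be illustrated with another small Python sketch. Again this is only a toy model under stated assumptions: the data and the function names limit_without_mrr and limit_with_mrr are made up, and the whole range is treated as a single MRR batch.

```python
# Toy comparison for "ORDER BY secondary_key LIMIT n"; data and names are illustrative.

# (secondary_key, primary_key) pairs in secondary-key order, as a range scan would yield them.
ENTRIES = [("a", 40), ("b", 3), ("c", 27), ("d", 8), ("e", 35), ("f", 1)]


def limit_without_mrr(entries, n):
    """Fetch rows in index order; only the first n entries need a back-to-table lookup."""
    rows = list(entries[:n])
    return rows, len(rows)


def limit_with_mrr(entries, n):
    """Fetch the whole batch in primary-key order, then restore index order for the LIMIT."""
    batch = sorted(entries, key=lambda e: e[1])  # lookups happen in this order
    lookups = len(batch)                         # every row in the batch is read
    rows = sorted(batch)[:n]                     # ORDER BY secondary_key LIMIT n
    return rows, lookups


print(limit_without_mrr(ENTRIES, 2))
print(limit_with_mrr(ENTRIES, 2))
```

Both calls return the same two rows, but the MRR variant performs six lookups instead of two before it can answer.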

Enabling MRR
set global optimizer_switch = 'mrr=on,mrr_cost_based=off';

Even with MRR enabled, it only comes into play when bookmark lookups are needed, that is, when the query has to go back to the table. If the secondary index alone can supply the required columns, there is no back-to-table step and therefore no random IO to optimize.

As shown below: if order_status is removed from the query, there is no need to go back to the table, so MRR is not used.

Also, once MRR is used, the rows come back in clustered index order rather than in secondary index order, unless the query sorts them explicitly. Given the principle above, this should not be hard to understand.

  

 

MRR can also be used to optimize table joins. The principle is similar.

 

Summary:

Index Condition Pushdown and Multi-Range Read are both optional strategies that MySQL offers to improve query execution, and both are relatively new features in MySQL.
I have not been working with MySQL for long and my knowledge is limited. All of this felt new to me, so please go easy on the criticism.
What the two have in common is that both are optimizations applied during index seeks (or index range scans).
They improve how the query behaves while a secondary index is searched (index range scan).
Of course, these optimizations are not a cure-all. MySQL lets users enable or disable them explicitly, which gives users full freedom. However, freedom does not guarantee the right result, so users need to weigh the trade-offs carefully when tuning.

 

References:

https://mariadb.com/kb/en/mariadb/multi-range-read-optimization/
http://blog.itpub.net/22664653/viewspace-1673682/
http://blog.itpub.net/22664653/viewspace-1678779/
http://mysql.taobao.org/monthly/2015/12/08/

and various online searches.

 

 

Finally, the figures in the official MariaDB documentation are very good and helpful for understanding the problem, so I have borrowed them here for reference (cheeky smile, O(∩_∩)O~).

It suddenly occurred to me that the same goes for people: why always charge straight ahead? Often rushing does not get you there faster. Taking a detour, pausing for a moment, planning, and then setting off again may not be a bad thing.

 
