Mysql--bnl/icp/mrr/bka

Last Update:2018-06-28 Source: Internet

Author: User

Tags mysql version

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

#======================================================##
MySQL Association query algorithm:
BNL (Block nested-loop)
ICP (Index Condition pushdown)
MRR (Multi-range Read)
BKA (batched Key Access)

#======================================================##
BNL (Block nested-loop)
Scene:
Suppose TB1 and TB2 are associated with a query to TB1 to look for a matching row of rows in TB2, but because there are no indexes available in TB2, the data for the entire T2 table needs to be scanned, so the number of data rows for the outer TB1 determines the number of scans for the inner TB2.

Optimization:
The data rows of the outer table TB1 are split into n blocks, each block contains m data, the TB2 is scanned n times, and each row of the TB2 data is scanned to match the data of a block, reducing the original scan count from M*n to n times for the TB2 table.

Focus:
1. There is no index available in the table
2, the inner table and the appearance of the order can not be swapped, such as the left JOIN operation

The algorithm already exists in MySQL version 5.1.

#======================================================##
ICP (Index Condition pushdown)
Scene:
Suppose the table TB1 has an index IDX_C1_C2_C3 (C1,C2,C3), for the query select * from TB1 WHERE c1= ' xxx ' and c3= ' xxx '

Prior to MySQL version 5.6, due to the lack of C2 filtering conditions, the InnoDB storage engine layer can only use the index idx_c1_c2_c3 the c1= ' XXX ' criteria to find all eligible index records, and then based on these index records to find in the clustered index, The found table data is returned to the MySQL server layer and then filtered by the MySQL server layer using the c3= ' XXX ' condition to get the final result.

In the MySQL 5.6 version of the introduction of the ICP feature, the INNODB storage engine layer can only use the index IDX_C1_C2_C3 according to c1= ' xxx ' condition to scan all eligible index records, and then filter these index records according to the c3= ' xxx ' condition. and follow the filtered index records to go to the clustered index to find, the found table data back to the MySQL server layer, the final result.

Assuming that the data behavior of the c1= ' xxx ' condition satisfies the 100,000 bar, while satisfying c1= ' xxx ' and c3= ' xxx ' data behavior 100, then:
1. In MySQL version 5.5, the TB1 clustered index needs to be 100,000 times the index seek operation, and the InnoDB storage engine layer passes 100000 rows of data to the MySQL server layer.
2. In MySQL version 5.6, the use of ICP requires only the index seek operation of the TB1 clustered index 100 times, and the InnoDB storage engine layer passes 100 rows of data to the MySQL server layer.

By "Sinking" the filter condition from the MySQL server layer to the storage Engine layer, the ICP achieves:
1, reduce the number of operations to find the clustered index;
2, reduce the amount of data returned from the storage engine layer to the MySQL server layer;
3. Reduce the number of times the MySQL server tier accesses the storage engine layer.

The PS1:ICP is used only for nonclustered indexes.
PS2: In MySQL 5.6, only normal tables are supported for ICP operations, while MySQL 5.7 supports ICP operations on partitioned tables.
#======================================================##
MRR (Multi-range Read)
Suppose the table TB1 has an index IDX_C1 (C1), for the query select * from TB1 WHERE C1 in (' XXX1 ', ' XXX2 ',...., ' xxxn ')

Prior to MySQL version 5.6, the IDX_C1 was indexed by the c1= ' XXX1 ' condition, and then the corresponding data records were found in the clustered index of TB1, followed by c1= ' XXX2 ' ... Operation to C1= ' Xxxn ', sets the result of each operation and obtains the final result set. Because the indexed records contained in the index record based on the C1 condition are unordered, resulting in more random IO on the index seek operation of the clustered index, affecting the server storage performance.

Introducing the MRR feature in MySQL version 5.6, first follow c1= ' XXX1 ' .... and c1= ' XXXM ' conditions to find an index record that satisfies the condition is placed in buffer, and when buffer is full, the index record in buffer is sorted by the clustered key, the corresponding record is found in the clustered index according to the sorted result, It can effectively change the original random lookup to sequential lookup, convert some random io into sequential io, prompt query performance, and reduce the consumption of query to server IO performance.

PS1:MRR is also applicable only to nonclustered indexes, and result sets obtained from nonclustered indexes are randomly unordered on the clustered key.
PS2: Assuming that the TB1 clustered index above is an ID, then IDX_C1 (C1) is equivalent to IDX_C1 (C1,id), and if only a single equivalent query is made to a nonclustered index, the resulting set of results is ordered for the clustered key without the use of the MRR attribute.
The size of the buffer involved in PS3:MRR depends on the setting of the parameter read_rnd_buffer_size

#======================================================##
BKA (batched Key Access)
Scene:
Assume that TB1 and TB2 are associated with queries to TB1 to match the appearance of loops to TB2, and there are indexes on the table TB2 that can be used.

Before the MySQL 5.6 version, only the data in the loop TB1 is indexed on TB2, and if the data on the TB1 is unordered, then the index lookup for TB2 is also random, resulting in a large number of random IO operations.
In MySQL version 5.6, according to MRR, the data in the TB1 is first put into buffer, and when buffer is full, the data in buffer is sorted by the correlation key, and then the TB2 is indexed in an orderly way, and some random IO operations are converted to sequential IO operations.

Since Ps1:bka has been in MRR, the MRR feature must be turned on to use BKA, but mrr_cost_based is officially recommended because MRR-based cost estimates do not guarantee mrr_cost_based usage.
The size of the buffer used by the PS2:BKA depends on the parameter join buffer size

#======================================================##
Set turn on MRR and BKA and turn off mrr_cost_based
SET optimizer_switch= ' Mrr=on,mrr_cost_based=off,batched_key_access=on ';

#======================================================##
The difference between BKA and BNL:
1, the internal table index, BKA requires that the table has an index that can be used, while the BNL is because the inner table does not have the index to use and the last resort optimization
2, the BKA algorithm aims to reduce the random index seek operation and reduce the random IO, while the BNL algorithm is designed to reduce the number of scans on the internal table and reduce the IO overhead of scanning.

#======================================================##

Mysql--bnl/icp/mrr/bka

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More