MySQL Performance Optimization-Block Nested-Loop Join (BNL)
1. Introduction
I believe that many developers/DBAs are still not satisfied with the way MySQL processes multi-Table associations or the performance when using MySQL. Join queries submitted by developers are generally resistant to join queries. Therefore, we recommend that you split the join statements to avoid performance problems that may result from join operations and increase network interaction between programs and databases.
Before MySQL 5.5, MySQL itself only supports one association method between tables, namely Nested Loop ). If the data size of the joined table is large, the join Operation takes a long time. In Versions later than 5.5, MySQL optimizes Nested execution by introducing the BNL algorithm. This article introduces two join algorithms Nested-Loop Join (NLJ) and Block Nested-Loop Join (BNL ).
Principle 2
2.1 Nested Loop Join algorithm
NLJ algorithm: Use the result set of the driving table or external table as the basic cyclic data, and then obtain data from the result set one by one as the filter condition of the next table to query the data, and then merge the results. If a multi-table join operation exists, the result set of the preceding table is used as the cyclic data, each row is retrieved, and then matched cyclically in the next joined table. The result set is returned to the client.
The pseudo algorithm of Nested-Loop is as follows:
For each row in t1 matching range {
For each row in t2 matching reference key {
For each row in t3 {
If row satisfies join conditions,
Send to client
}
}
}
Because a normal Nested-Loop transmits only one row to the inner Loop at a time, the number of rows in the outer Loop (result set) will be executed several times in the memory Loop. when an index is attached to an internal table, the scanning cost is O (Rn). If no index is available, the scanning cost is O (Rn * Sn ). If the internal table S has many records, the SimpleNested-Loops Join will scan the internal table many times, and the execution efficiency is very poor.
2.2 Block Nested-Loop Join algorithm
BNL algorithm: stores rows/result sets of the outer loop into the join buffer. each row of the inner loop is compared with the records in the whole buffer to reduce the number of inner loops.
For example, if the result set of the Outer Loop is 100 rows, You need to scan the internal table 100 times using the NLJ algorithm. If you use the BNL algorithm) the 10 rows of records read each time are placed in the join buffer, and then the 10 rows of data are directly matched in the InnerLoop table (internal table). The memory loop can be compared with the 10 rows at a time, in this way, we only need to compare 10 times and reduce the number of internal table scans by 9/10. Therefore, the BNL algorithm can significantly reduce the number of inner cycle table scans.
If join buffer is used for the query described above, the actual join diagram is as follows:
For each row in t1 matching range {
For each row in t2 matching reference key {
Store used columns from t1, t2 in join buffer
If buffer is full {
For each row in t3 {
For each t1, t2 combination in join buffer {
If row satisfies join conditions,
Send to client
}
}
Empty buffer
}
}
}
If buffer is not empty {
For each row in t3 {
For each t1, t2 combination in join buffer {
If row satisfies join conditions,
Send to client
}
}
}
If the column lengths of t1 and t2 involved in join are only s, and c is the combination of the two, the number of scans in Table t3 is
(S * C)/join_buffer_size + 1
The number of times of scanning t3 decreases with the increase of join_buffer_size until the join buffer can accommodate all t1 and t2 combinations, and then increases the join buffer size, so the query speed will not become faster.
2.3 MySQL uses Join Buffer with the following key points:
1. The join_buffer_size variable determines the buffer size.
2. The join buffer can be used only when the join type is all, index, and range.
3. Each join of the buffer will allocate a buffer, that is to say, a query may eventually use multiple join buffer.
4. The first nonconst table does not allocate a join buffer, even if its scan type is all or index.
5. The join buffer will be allocated before the join operation and will be released after the query is executed.
6. join buffer only stores the columns involved in join, not the entire data row.
3. How to Use
In version 5.6 and later versions, the block_nested_loop parameter in optimizer_switch controls whether BNL is used in the optimizer_switch. It is enabled by default. If it is set to off, the optimizer selects the NLJ algorithm when selecting the join method.
Iv. References
Version 5.6 BNL supports outer join and semi-join and is associated with other features such as BKA. Other optimization points will be written later.
Nested-Loop Join Algorithms
Block Nested-Loop and Batched Key Access Joins
Mysql join buffer
This article permanently updates the link address: