MySQL Nested-Loop Join algorithm learning, nested-loopjoin

Source: Internet
Author: User

MySQL Nested-Loop Join algorithm learning, nested-loopjoin

After playing MySQL for more than two years, I found that many people say that MySQL is inferior to Oracle in terms of optimizer. In fact, to some extent, it is true, however, after all, MySQL is only available in version 5.7, and Oracle has been developed to 12c. Today I have looked at the MySQL connection algorithm. Well, Hash Join is not supported yet, only Nested-Loop Join is available. Let's summarize my learning experience today.

The basic Algorithm Implementation of Nested-Loop Join is as follows:

for each row in t1 matching range {  for each row in t2 matching reference key {    for each row in t3 {      if row satisfies join conditions,      send to client    }  }}

This code is very simple. Although I do not write much code, I still understand it. Here we assume there are three tables, t1, t2, t3, and this Code respectively show the range, ref, and ALL in the explain Plan, which are shown in the SQL Execution Plan layer, t3 will perform a full table scan. Today I saw a cool SQL Optimization Method in this place, Straight-join: optimize. We also mentioned that, by narrowing down the result set of the driver table for connection optimization, we can see that the driver table with a small result set can indeed reduce the number of loops.

Of course, on the basis of this algorithm, MySQL introduced the Block Nested-Loop join algorithm. In fact, it is basically no different from the above algorithm. The pseudo code is as follows:

for each row in t1 matching range {  for each row in t2 matching reference key {    store used columns from t1, t2 in join buffer    if buffer is full {      for each row in t3 {        for each t1, t2 combination in join buffer {          if row satisfies join conditions,          send to client        }      }      empty buffer    }  }}if buffer is not empty {  for each row in t3 {    for each t1, t2 combination in join buffer {      if row satisfies join conditions,      send to client    }  }}

This algorithm caches the data of the outer loop in the join buffer, and compares the data in the buffer of the table round in the inner loop to reduce the number of cycles, thus improving the efficiency. There is a example on the official website, which I do not understand: If 10 rows are cached in the buffer, these 10 rows are passed to the inner loop, all rows in the inner loop are compared with the 10 rows in the buffer. The original article is as follows:

For example, if 10 rows are read into a buffer and the buffer is passed to the next inner loop, each row read in the inner loop can be compared against all 10 rows in the buffer

If S refers to the size of t1 and t2 combinations in the cache, and C is the number of these combinations in the buffer, the number of times the t3 table is scanned should be:

(S * C)/join_buffer_size + 1

According to this formula, the larger join_buffer_size, the smaller the number of scans. If join_buffer_size is used to cache all the previous row combinations, this is the best performance time, it will have no effect if it is increased later.

All the learning is done by reading the documentation.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.