Nested-loop Join Algorithms

Last Update:2018-07-29 Source: Internet

Author: User

Tags joins

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

MySQL uses nested loop algorithms to implement joins between multiple tables. nested-loop Join Algorithms

A simple nested loop join (NLJ) algorithm that loops through rows from the first table, takes each row, and loops through the next table in the join. This process repeats many times until the remaining tables are joined.
Suppose the table T1, T2, and T3 join with the following join types:

Table   Join Type
T1      range
T2      ref
T3      All

If you are using a simple NLJ algorithm, the join process is like this:

For each row in T1 matching range {For each
  row in T2 matching reference key {for each
    row in T3 {
      if row Satisfies join conditions,
          Send to Client
    }
  }

Because the NLJ algorithm is a row that matches the inner loop through the outer loop, the inner loop table is scanned multiple times. Block nested-loop Join algorithm

A block nested cyclic join (BNL) algorithm that caches rows from the outer loop, reads rows from the cache, and reduces the number of times the inner loop table is scanned. For example, if 10 rows are read into the buffer and the buffer is passed to the next inner loop, each row read within the loop can be compared to the 10 rows of the buffer. This reduces the number of times that the inner loop table is scanned by one order of magnitude.
When MySQL uses a join buffer, the following principles follow:

The value of the join_buffer_size system variable determines the size of each join buffer.

When the join type is all, index, range (in other words, when the join process scans the index or data), MySQL uses the join buffer.

A buffer is allocated to every join that can be buffered, so a query might use multiple join buffers.

The join buffer is never assigned to the first table, even if the table's query type is all or index.

Allocated before the join buffer join, and released after the query completes.

The columns that are used are placed in the join buffer, not all columns.

The above example uses the NLJ algorithm (without caching) and uses a cached join as follows:

For each row in T1 matching range {
  for each row in T2 matching reference key {
    store used columns to T1, T2 in Join buffer
    If buffer is full {
      for each row in T3 {
        for each T1, T2 combination in join buffer {
          if row s Atisfies join conditions,
          Send to Client
        }
      }
      empty buffer
    }
  }

if buffer isn't Empty {For each
  row in T3 {for each
    T1, T2 combination in join buffer {
      If row satisfies join conditions,
      Send to Client
    }
  }

The above process is explained as follows:
1. Place the T1, T2 join results into the buffer until the buffer is full;
2. Traverse T3, internal recycle buffer, and find matching row, send to client;
3. Empty buffer;
4. Repeat the above steps until the buffer zone is dissatisfied;
5. Process the remaining data in the buffer, repeat step 2.

Set S is the size of each storage T1, T2 combination, C is the number of combinations, then the number of T3 is scanned:

(S * C)/join_buffer_size + 1

Thus, with the increase of join_buffer_size, T3 will be scanned less often, if the join_buffer_size large enough to accommodate all T1 and T2 join the data produced, the T3 will only be scanned 1 times.

English Address: http://dev.mysql.com/doc/refman/5.5/en/nested-loop-joins.html
This article from: Gao | Coder, original address: http://blog.csdn.net/ghsau/article/details/43762027, reprint please specify.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More