How Oracle tables are connected (2)-----Basic mechanism of HASH join 3

Source: Internet
Author: User

The mode of the HASH join

Hash join has three kinds of working modes, namely optimal mode, OnePass mode and Multipass mode, respectively, there are corresponding statistic information in V$sysstat:

Sql> select name, value from V$sysstat where name is like '%workarea executions% ';

Optimal mode

The optimal pattern is a small set of results obtained from the build table, which allows the entire hash table to be built into the memory area that the user can use. The following figure is used to describe Optimal's hash join:

The optimal hash join is broadly divided into the following steps:

1. First use the hash function on the concatenated column to make the result set of the build table into a hash table in memory, where the hash bucket is always 2 of the n-th square, such as 1024 or 4096. You can simply think of hash table as a large square in memory, you have a lot of small squares on it, and the data on the build table is distributed in these squares, and these squares are hash buckets.

2. Oracle begins to read probe table, using the hash function on the column that each piece of data is connected to, locating the hash bucket corresponding to the same value of the build table, and finding the corresponding hash bucket and going to that position to see if there are any matching data. This process is called probing hash table.

3. When checking the Bush bucket, if there is no data in the bucket, throw away the line of probe table immediately. If there is data in the bucket, it is necessary to further check whether the data in it matches the current data of proble table. Before we introduce the filter and hash cluster table, we mentioned that there is a conflicting problem with the hash function, that is, the hash key is completely possible to correspond to the same hash bucket, so when we are probe When a value on table is anchored to a hash bucket, we need to do a further check to see what data we need in this bucket and what data we don't need. In the ideal case, we hope that each hash bucket inside a maximum of only one hash key data, so often hash bucket number is more than the number of hash key.

OnePass hash Join mode

We know that the optimal hash join occurs when we can put the entire hash table in memory, so that all join operations can be done in memory, which is our ideal pattern. However, when our memory is unable to drop the entire hash table, we have to hash join in OnePass mode.

1. First, because memory can not put down all the hash table content, it will result in some hash bucket in memory, some hash bucket on disk, but wherever it is placed, Oracle uses a bitmap structure to reflect the state of these buckets , including the location and whether there is data inside.

2. When our probe table uses the hash function on the connected column, first go to bitmap to see if the corresponding bucket is empty, and if it is empty, the data will be discarded. If it is not empty, it depends on whether the bucket is in memory or on disk. If it is in memory, it accesses the bucket directly and checks for data matching, and returns the result if there is a match. The second case is if the bucket to be accessed is on disk, and if the access to the disk directly is obviously very large, Oracle's approach is to put this probe data aside. By the way, the value of this probe will be placed in memory first, if later accumulated a certain amount of other probe on the data, Oracle will write these data in bulk to disk, this is the graph of dump probe partitions to disk.

3. After we have scanned the probe intact, we may have returned some of the matching data, but we now have two pieces of data that are not processed on the disk: part of the hash table of the build table and part of the data for probe table, Now Oracle re-makes the two pieces of data hash join (this time will be re-comparison of whose result set is relatively small, so the original build table may become probe table, the original probe table into a build table), The final query results are then returned. This is the approximate process of onepass hash join.

Multipass Hash Join mode

Finally, if our memory is particularly small or the data that needs to be hashed is particularly large, the hash join will be executed in the worst way: multipass Hash join. If OnePass is only required to do probe table reads more than once from disk, then Multipass needs to do multiple reads, which often occurs when there is a significant difference in the amount of available memory and data. Multipass Hash join is something we need to avoid as much as possible.

A single optimization of the hash join:

Http://www.cnblogs.com/killkill/archive/2010/07/22/1782889.html

--Organized from the network

How Oracle tables are connected (2)-----Basic mechanism of HASH join 3

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.