10 Table connection Optimization

Source: Internet
Author: User

Summary
-----------------------------------------
The query optimizer needs to determine the connection sequence and method of multiple tables to filter unnecessary data as soon as possible and reduce the data volume to be processed.
This chapter introduces three basic Connection Methods: nested loop connections, merge connections, and hash connections.
Basic suggestions on how to select connections.
-----------------------------------------

10.1 Definition
Connection tree
All the connection methods supported by the database engine can only process two datasets at a time. They are called left nodes and right nodes. When you need to connect more than two datasets,
The query optimizer evaluates different connection trees.
1) The most common connection tree of the left deep tree Query Optimizer
Each of its connections has a table (table rather than the result set generated in the previous table) as the right node.

-------------------------------------
| ID | operation | Name |
-------------------------------------
| 1 | hash join |
| 2 | hash join |
| 3 | hash join |
| 4 | table access full | T1 |
| 5 | table access full | T2 |
| 6 | table access full | T3 |
| 7 | table access full | T4 |
-------------------------------------
2) The right-depth tree is rarely used by the query optimizer.

3) sawing tree
Each join has at least one table as the input, but the input based on this table is sometimes on the left, sometimes on the right, rarely used.

4) dense trees
Its two inputs may not be tables. That is to say, the structure of this tree is completely free. The query optimizer selects it only when there is no other choice.


Restrictions and connection conditions
In fact, in the traditional connection syntax, the where clause is also used to indicate the connection conditions and restrictions. On the contrary, in the ANSI-standard connection syntax, the restriction condition is in the where
Clause, and the join condition is defined in the from clause.
~ The two datasets are connected by connection conditions;
~ Apply constraints on the result set returned by the connection.
That is to say, when two tables are connected, the join condition is to prevent cross join (Cartesian Product). The corresponding condition is to filter the previous operation (such as the table
Result set, for example.

-- EMP. deptno = Dept. deptno is the connection condition -- Dept. loc = 'Dallas 'restriction condition select EMP. ename from EMP, DEPT where EMP. deptno = Dept. deptno and dept. loc = 'Dallas'
10-1


On the one hand, connection conditions can be used to filter data. On the other hand, in order to minimize the amount of data used by connections, restrictions may be evaluated before connection conditions,
For example, in the above example, although the writing position is restricted by the connection conditions, note that the restriction condition Dept. Loc = 'dall' is applied before the connection conditions.
10.2 nested loop connection (nested loops)
It is divided into an External Loop and an internal phantom. The External Loop is the left subnode, and the internal loop is the right subnode. When an External Loop is executed once, the internal loop needs to be targeted at the External Loop.
Each record returned is executed once. nested loops have the following features:
~ The left subnode (External Loop) is executed only once, while the right subnode (internal loop) is generally executed many times.
~ The first record of the result set can be returned before all data is processed.
~ Indexes can be effectively used to process restrictions and connection conditions.
~ Supports all types of connections.
10.3 merge connections
When the merged connection is processed, both datasets are read and sorted by the connection condition fields. After these operations are completed, the content of the two workspaces is
Merge, features:
~ Each sub-node is executed only once.
~ Each input dataset must be sorted by the fields of the connection condition.
~ Because of these sorting operations, both input datasets must be fully read and sorted before the first record of the returned result set is returned.
~ All connection types support merged connections.
There are also hash connections and external connections.
10.6 select the connection method
Consider the following:
~ Optimizer goals: first_rows or all_rows
~ Connection type and optional predicate Conditions
~ Can parallel connections be executed?



Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.