Oracle joins and semi-joins

Source: Internet
Author: User

Oracle multi-table joins fall into three categories: nested loop, sort merge, and hash join. Each category in turn comes in three variants: the traditional join, the semi-join, and the anti-join. (The last two are collectively called semi-joins.)

Nested loop: there are two tables, the driving table and the driven table. The driven table is traversed multiple times, once for each row of the driving table. The first record is returned quickly, no sorting is needed, and non-equality join conditions can be used.

Sort merge: the two tables have equal status. Each table is sorted first, and the sorted sets are then merged to return the result set. Sorting is attempted in memory first; a sort that completes entirely in memory is called an Optimal Sort, also known as an In-Memory Sort. If disk buffering is needed, it is called an External Sort. In an external sort, each sort run is an I/O operation on disk. If the whole data set can be sorted with a single pass over the input, it is a 1-Pass Sort. A Multi-Pass Sort needs several rounds of input and output. In terms of performance, Optimal Sort > 1-Pass Sort > Multi-Pass Sort. The execution plan reports the following columns:

  OMem: estimated memory needed for an Optimal Sort.
  1Mem: memory needed for a 1-Pass Sort.
  O/1/M: number of executions in Optimal, 1-Pass, and Multi-Pass mode.

Hash join: there is a driving table and a driven table. The process has two phases. Build phase: the join columns of the driving table are hashed to produce a series of hash buckets. Probe phase: each record of the driven table is read in turn, the same hash function is applied to its join columns, and the result is matched against the hash buckets of the driving table.

Comparison of the methods: Oracle sorts with a binary tree insertion sort algorithm (Binary Insertion Tree).
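To give the comparison a concrete reference point, the three join methods described above can be sketched in Python. This is only a conceptual model, not Oracle's implementation: the function names, the tuple layout, and the sample tables A (id, name) and B (row id, p_id) are assumptions made for illustration, and NULL handling is left out.

```python
from collections import defaultdict

# Hypothetical sample data: A rows are (id, name), B rows are (row_id, p_id).
A = [(1, "x"), (2, "y"), (3, "z")]
B = [(10, 1), (11, 1), (12, 3)]


def nested_loop_join(driving, driven, match):
    """Nested loop: scan the driven rows once per driving row."""
    for d in driving:
        for r in driven:
            if match(d, r):  # any predicate works, not only equality
                yield (d, r)


def sort_merge_join(left, right, key_l, key_r):
    """Sort merge: sort both inputs on the join key, then merge them."""
    ls = sorted(left, key=key_l)
    rs = sorted(right, key=key_r)
    out = []
    i = j = 0
    while i < len(ls) and j < len(rs):
        kl, kr = key_l(ls[i]), key_r(rs[j])
        if kl < kr:
            i += 1
        elif kl > kr:
            j += 1
        else:
            # Find the group of right rows sharing this key ...
            j_end = j
            while j_end < len(rs) and key_r(rs[j_end]) == kl:
                j_end += 1
            # ... and pair it with every left row sharing the same key.
            while i < len(ls) and key_l(ls[i]) == kl:
                for r in rs[j:j_end]:
                    out.append((ls[i], r))
                i += 1
            j = j_end
    return out


def hash_join(driving, driven, key_d, key_r):
    """Hash join: build buckets from the driving rows, then probe them."""
    buckets = defaultdict(list)
    for d in driving:  # build phase
        buckets[key_d(d)].append(d)
    out = []
    for r in driven:  # probe phase
        for d in buckets.get(key_r(r), []):
            out.append((d, r))
    return out


nl = sorted(nested_loop_join(A, B, lambda a, b: a[0] == b[1]))
sm = sorted(sort_merge_join(A, B, lambda a: a[0], lambda b: b[1]))
hj = sorted(hash_join(A, B, lambda a: a[0], lambda b: b[1]))
```

All three functions return the same pairs for an equality join; only the nested loop version accepts an arbitrary predicate, which mirrors the remark that nested loops can handle non-equality joins.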
In the in-memory sort index (the binary tree), each node corresponds to a record, and each node also stores a pointer to its parent node and to its two child nodes. On a 32-bit system this overhead is 12 bytes per node; on a 64-bit system it is 24 bytes. Sorting is intensive on both memory and CPU. A sort done entirely in memory is sometimes not as fast as a disk sort. If the CPU is the bottleneck and I/O is idle, reduce the size of the sort area and use a 1-Pass Sort. When creating an index in particular, reducing SORT_AREA_SIZE can improve performance. The reason is that the number of record comparisons is about the same for memory sorts and disk sorts, but in a memory sort the binary tree may grow too tall and consume too much CPU.

A hash join consumes much less memory than a sort merge and does not need CPU-intensive operations, so the hash join algorithm is generally better than the sort merge algorithm. When the query is concerned with the whole result set rather than only part of it, a hash join behaves much like a nested loop but performs better, because the hash table is built in the PGA and needs no latch protection.

Semi-joins are the transformed form of IN, EXISTS, NOT IN, and NOT EXISTS. A subquery in the FROM clause is called an inline view, and a subquery in the WHERE clause is called a nested subquery. IN, EXISTS, NOT IN, and NOT EXISTS are all nested subqueries. Oracle can process a nested subquery in two ways: unnest it or leave it nested. Likewise, it can process a nested view in two ways: merge it or leave it unmerged. Before Oracle 10g, the optimizer unnests subqueries before optimization, without any cost evaluation. IN and EXISTS are unnested into a semi-join; NOT EXISTS and NOT IN are converted into an anti-join. For inline views and other views, Oracle also tries to merge them into the main query. This action is called Merge, and the corresponding hint is MERGE.
Whether a view was merged can be confirmed in the execution plan: if the plan contains no VIEW operation, the view was merged; if it contains a VIEW operation, it was not. For subqueries, the corresponding process is called Subquery Unnesting. The difference between Merge and Unnest is that Merge can also handle views containing DISTINCT and GROUP BY clauses, which is called Complex View Merge. Views with set operators behave like Unnest here and cannot be merged. By default there is no Complex View Merge; it has to be requested to get the merge effect.

When a subquery is merged into the main query, the optimizer can determine the access path for the query as a whole. Otherwise, Oracle can only optimize the outer query and the subquery separately. Unnesting also makes the semi-join and anti-join methods provided by Oracle available. Not all subqueries can be unnested: subqueries that use CONNECT BY, START WITH, the ROWNUM pseudo column, set operators (UNION, UNION ALL, MINUS, INTERSECT), or aggregation (SUM, COUNT, GROUP BY) are not unnested.

The essence of a semi-join is that an outer record is returned as soon as a matching record is found in the inner table.

Without unnesting: similar to a nested loop, the subquery is executed once for each record of the main query, and the execution plan shows a FILTER operation. In Oracle 10g this has to be requested with a hint (NO_UNNEST is the hint that prevents unnesting). This is not a semi-join.

SQL> select id from a where exists (select 1 from b where a.id = b.p_id);

With unnesting:

SQL> select id from a where exists (select 1 from b where a.id = b.p_id);

The execution plan shows HASH JOIN SEMI, which indicates a semi-join. The advantage is that as soon as a record in table A matches one record in table B, the scan of B stops and processing moves on to the next record of A. The result does not need to be de-duplicated: even if the relationship between A and B is 1:n, each record of A is returned only once. Since Oracle 9i there is no difference between IN and EXISTS, and their execution plans are the same.
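The difference between the FILTER-style execution, the semi-join, and an ordinary join can be sketched in Python as follows. This is a simplified model under stated assumptions, not Oracle's actual execution engine: the function names are hypothetical, tables are plain lists of key values, and NULLs are ignored.

```python
# Hypothetical sample data: ids of table A and p_id values of table B (1:n for id 1).
A_IDS = [1, 2, 3]
B_PIDS = [1, 1, 3]


def filter_exists(outer_keys, inner_keys):
    """Not unnested (FILTER-like): evaluate the subquery once per outer row."""
    result = []
    for o in outer_keys:
        # any() stops scanning the inner rows at the first match.
        if any(i == o for i in inner_keys):
            result.append(o)
    return result


def hash_semi_join(outer_keys, inner_keys):
    """Unnested (semi-join-like): hash the inner keys once, then probe."""
    lookup = set(inner_keys)
    # Each outer row is emitted at most once, however many inner rows match.
    return [o for o in outer_keys if o in lookup]


def inner_join_keys(outer_keys, inner_keys):
    """Ordinary join for comparison: one output row per matching pair."""
    return [o for o in outer_keys for i in inner_keys if i == o]


semi_filter = filter_exists(A_IDS, B_PIDS)
semi_hash = hash_semi_join(A_IDS, B_PIDS)
joined = inner_join_keys(A_IDS, B_PIDS)
```

With this data both semi-join variants return ids 1 and 3 once each, while the ordinary join returns id 1 twice, which is why a semi-join result needs no de-duplication.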
The semi-join hints apply to the following forms (the hint names are listed in the table at the end).

EXISTS:

SQL> select id from a where exists (select 1 from b where a.id = b.p_id);

IN:

SQL> select id from a where id in (select p_id from b);

NOT EXISTS: the unnested anti-join is used by default.

SQL> select id from a where not exists (select 1 from b where a.id = b.p_id);

NOT IN: watch out for NULL. The difference between NOT IN and NOT EXISTS is that NOT IN checks whether the subquery result contains a NULL. If it does, the NOT IN condition can never be true, so no rows are returned. NOT EXISTS does not care about NULLs; it only cares whether the subquery returns any record, and if it does, NOT EXISTS is FALSE. If the matching column of a NOT IN can be NULL, performance problems may follow because the index cannot be used.

Hints:

Operation    Nested Loop   Hash Join   Sort Merge
Join         USE_NL        USE_HASH    USE_MERGE
Anti Join    NL_AJ         HASH_AJ     MERGE_AJ
Semi Join    NL_SJ         HASH_SJ     MERGE_SJ
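The NULL behaviour of NOT IN versus NOT EXISTS described above follows from SQL's three-valued logic, which the following Python sketch models with None standing for NULL (and for UNKNOWN as a predicate result). The function names and sample values are assumptions chosen for illustration.

```python
def not_in(value, subquery_values):
    """value NOT IN (subquery): returns True, False, or None (UNKNOWN)."""
    if not subquery_values:
        return True  # NOT IN over an empty set is always TRUE
    if value is None:
        return None  # comparing NULL with anything is UNKNOWN
    if any(v is not None and v == value for v in subquery_values):
        return False  # a definite match was found
    if any(v is None for v in subquery_values):
        return None  # no match, but a NULL could have been one
    return True


def not_exists(value, subquery_values):
    """NOT EXISTS (SELECT 1 FROM b WHERE a.id = b.p_id) for one outer row."""
    # The correlated predicate is only satisfied by a definite equality,
    # so NULLs on either side never produce a matching row.
    return not any(
        value is not None and v is not None and v == value
        for v in subquery_values
    )


def where(rows, predicate):
    """A WHERE clause keeps a row only when the predicate is exactly TRUE."""
    return [r for r in rows if predicate(r) is True]


a_ids = [1, 2, 3]
b_with_null = [1, None]
b_without_null = [1]

not_in_with_null = where(a_ids, lambda x: not_in(x, b_with_null))
not_exists_with_null = where(a_ids, lambda x: not_exists(x, b_with_null))
not_in_without_null = where(a_ids, lambda x: not_in(x, b_without_null))
```

A single NULL in the subquery result empties the NOT IN result, while NOT EXISTS still returns ids 2 and 3; without the NULL the two forms agree.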
