Original address: https://www.cnblogs.com/Dreamer-1/p/6076440.html
One: What is the Oracle execution plan?
Execution plan is a description of the execution process or access path of a query statement in Oracle
Two: How to view Oracle execution plan?
Because I've been using the Plsql remotely connected company database, so here's an example of Plsql:
①: Configure the items that the execution plan needs to display:
Tools, preferences, window type, Project window , configure the columns to display in the execution plan as needed
The usual column fields for the execution plan are explained:
Cardinality (rows): The number of returned result set rows for the current operation estimated by Oracle
Bytes (Bytes): Number of bytes returned after performing this step
Cost, CPU consumption: Oracle estimates the execution costs of this step to illustrate the cost of SQL execution, and the smaller the theory the better (the value may differ from the actual)
Time: The time required for Oracle to estimate the current operation
②: Open Execution Plan:
Press F5 to view the execution plan for the query statement you just executed after executing a SELECT statement in the SQL window
Note: Some sql*plus commands plsql cannot be supported by using SQL commands in Plsql to view execution plans, such as SET AUTOTRACE on
III: Read Oracle execution plan
①: Execution Order:
According to the operation Indentation, the indentation is the most first execution; (at the same time, the topmost first executes)
Example: The index range Scan and the index UNIQUE scan two actions indent the most, the top index range scan executes first;
Same level if an action does not have a child ID, it is executed first
The same-level actions follow the most right -most-first-execution principle
Example: In table access by Global index ROWID and table access by index ROWID Two action indents are at the same level, then the table access by global index ROWID This action is performed first, and the action contains a Child action index range scan, then the child action on the right index range scan is executed first;
The SQL execution order in the diagram is:
Index RANGE scan-> table access by GLOBAL index rowid-> index UNIQUE scan-> TABLE access by index ROWID -NESTED LOOPS outer-> SORT GROUP by-> SELECT STATEMENT, GOAL = all_rows
(Note: Plsql provides a function button to view the order of execution (in the Red Box section))
②: Some descriptions of the actions in the diagram:
1. Middle TABLE ACCESS by ... This describes how the table accesses (or Oracle accesses the data) when the action is executed;
Table access in several ways: (Not all)
- Table ACCESS Full (All-table scan)
- Table access by ROWID (accessed through ROWID tables)
- TABLE ACCESS by index Scan (index scans)
(1) table ACCESS Full (whole table scan):
Oracle reads all rows in the table and checks to see if each row satisfies the Where constraint in the SQL statement;
Full table scan can use multi-block read (that is, one I/O read multiple blocks of data block) operation, improve throughput;
Usage Recommendation: Tables with large data volumes are not recommended for full table scans, unless they require more data to be fetched, accounting for 5% to 10% or more of the total table data
(2) Table Access by ROWID (accessed via ROWID table) :
First, what is ROWID?
ROWID is a column pseudo-column that is automatically added by Oracle to each row in the table, and since it is a pseudo-column, the table does not physically store the value of ROWID;
You can use it as you would with other columns, but you cannot increment, delete, or change the value of the column;
Once a row of data is inserted, its corresponding ROWID is unique throughout the life of the row, and the ROWID value of the row is unchanged even if a row migration occurs.
Let's go back to TABLE ACCESS by ROWID:
The rowid of the line indicates the data file, the data block, and the position of the row in the block, so the ROWID can quickly locate the target data, which is the fastest way to access the single row data in Oracle;
(3) TABLE ACCESS by indexScan:
In the index block, both the key value of each index is stored, and the rowid of the row with that key value is stored.
After a numeric column is indexed, the index may have a conceptual structure such as:
So the index scan is actually divided into two steps:
Ⅰ: Scan index to get corresponding ROWID
Ⅱ: Read data by ROWID to specific rows
----------------Index Scan Extension-------------------
There are five more types of index scans:
- Index unique Scan (indexed only)
- Index range Scan
- Index full scan
- Index fast full Scan (indexed quick Scan)
- Index skip Scan (indexed hop scanning)
A) index unique Scan (index unique scanning):
A scan of a unique index that returns at most one record at a time;
When a unique, PRIMARY KEY constraint exists for a field in a table, Oracle often implements a unique scan;
b) index range Scan (index range scanning):
Use an index to access multiple rows of data;
Three scenarios in which an index range scan occurs:
- A range operator is used on a unique index column (such as:> < <> >= <= between)
- On a composite index, queries are made using only a subset of columns (the query must contain a leading column or a full table scan)
- Any query that is made on a non-unique indexed column
c) index full Scan (index fully scanned):
When a full index scan is performed, the data that is queried must be available directly from the index (note that full index scans are only valid in CBO mode)
-----------------------Extended reading: Oracle Optimizer brief-----------------------
The optimizer in Oracle is an optimization tool for SQL analysis and execution that is responsible for generating and developing an execution plan for SQL.
There are two types of Oracle optimizer:
- RBO (rule-based optimization) rule-based Optimizer
- CBO (cost-based optimization) cost-based optimizer
RBO:
Rbo has strict rules of use, as long as you follow this set of rules to write SQL statements, regardless of the contents of the data table, it will not affect your execution plan;
In other words, Rbo is "insensitive" to data, which requires SQL writers to be aware of the rules;
Rbo has been used to Oracle 9i, starting with Oracle 10g, RBO has been completely discarded.
Cbo:
The CBO is a more reasonable and reliable optimizer than RBO, which completely replaces Rbo in Oracle 10g;
The CBO calculates the "cost" of a variety of possible execution plans, i.e. costs, from which the least expensive execution plan is chosen as the actual operating plan;
It relies on the statistics of database objects, the accuracy of statistical information will affect the CBO to make the best choice, that is, the data "sensitive."
---------------------------------------------------------------------
d) Index fast full scan (Index quick Scan):
Scanning all data blocks in an index is similar to index full SCAN, but one notable difference is that it does not sort the queried data (that is, the data is not returned in sort order)
e) index skip Scan (index hop scanning):
Oracle 9i is provided, sometimes the leading column of the composite Index (the first column in the index) does not appear in the query statement, ORALCE will also use the composite index, this time the use of the index SKIP SCAN;
When will the INDEX SKIP SCAN be triggered?
Prerequisites: The table has a composite index, and when the query has other columns other than the leading column (the first column in the index) as a condition, and when the optimizer mode is CBO
When Oracle discovers that the number of unique values for a leading column is small, each unique value is used as the entry for a regular scan, on top of which a lookup is made and the query is finally merged;
For example:
Suppose the table EMP has a ename (employee name), Job (job name), sex (gender) three fields, and a composite index such as CREATE INDEX idx_emp on emp (sex, ename, Job) is established;
Because the gender is only ' male ' and ' female ' two values, so in order to improve the index utilization, Oracle can be the composite index (' Male ', ename, Job), (' Female ', ename, job) These two composite indexes;
When querying the SELECT * from emp where job = ' Programmer ', the query is issued:
Oracle first enters sex as a ' male ' entry, when it uses the compound index (' Male ', ename, job) to find the entry for job = ' Programmer ';
Then enter the sex as ' female ' entrance, this time used to (' female ', ename, job) This compound index, find job = ' Programmer ' entry;
The result set from two portals that were finally merged into the query.
----------------------------------------------
2. In the NESTED LOOPS ... Describes the way the table is connected;
The join keyword is used to connect two tables, and only two tables can be connected at a time, and the steps of the join operation are generally serial (read in parallel when reading data from two tables that are connected);
The order of connections between the table (row source) has a significant impact on query efficiency, and the first Access table (driver table) applies some constraints (where filters) to obtain a smaller row source, which can increase the efficiency of the connection.
-------------------------extended reading: The Driver table (Driving table) and the matching table (probed table)-------------------------
Driver table (Driving table):
Tables that are accessed first when the table is connected, also called the outer table (Outer table), are used for NESTED LOOPS (nested loops) and hash joins (hash joins) ;
If the driver table returns more row data, it has a negative effect on all subsequent operations, so it is common to select a small table (a table that returns fewer rows after the where limit is applied) as the driver table.
Match table (probed table):
Also known as the inner-layer table (Inner table), when a row of specific data is fetched from the driver table, the table is searched for rows that meet the join criteria. Therefore, this table is generally a large table (a table that returns a larger number of rows after the where constraint is applied).
---------------------------------------------------------------------------------------------------------
Table joins in several ways:
- Sort Merge Join (sort-merge connection)
- NESTED LOOPS (Nested loops)
- Hash join (Hashed connection)
- Cartesian product (Cartesian product)
Note: The table that will be accessed first is called Row Source 1, and then the table participating in the connection is called Row Source 2;
(1) sort Merge Join (sort-merge connection):
Suppose there is a query: Select A.name, B.name from Table_a a joins Table_b b on (a.id = b.id)
Internal connection process:
A) generate the data required for row source 1 and sort the data according to the Connection Action Association column (a.id in the example)
b) to generate the data required for row Source 2, sort the data according to the corresponding Connection Operation Association column (b.id) in a)
c) The sorted rows on both sides are put together to perform a merge operation (scan the data sets on both sides and determine if they are connected)
Extended:
The connection speed can be greatly improved if the connection operation in the example is already sorted before the column a.id,b.id, because sorting is a time-consuming and resource-intensive operation, especially for tables with large amounts of data.
It is therefore possible to consider establishing an index on the a.id,b.id so that it can be pre-sequenced. unfortunately , because all the fields are included in the returned result set, the usual execution plan will not go into the execution plan even if the connection column is indexed, unless some specific columns are processed (such as querying only indexed columns, etc.).
Sort-Merge connected tables without driving order, who can be in front of them;
Sort-the connection conditions applicable to the merge connection are: < <= = > >=, the non-applicable connection conditions are: <> like
(2) NESTED LOOPS (nested loop):
Internal connection process:
A) Remove row 1 (the first row of data) from row Source 1, traverse all rows of row Source 2 and check for matches, and fetch the matching rows into the result set
b) Remove row 2 (the second row of data) of row source 1, traverse all rows of row Source 2 and check for matches, take out matching rows into the result set
c) ...
If Row source 1 (that is, the driver table) returns n rows of data, the row Source 2 is also traversed by the full table n times.
Because each row of row source 1 matches all rows of row Source 2, it is more efficient when row source 1 returns as few rows as possible and has efficient access to row source 2, such as establishing an appropriate index.
Extended:
The table with nested loops has a drive sequence, so be careful to select the appropriate driver table.
One of the advantages of having a nested loop connection is that you can return a connected row without having to wait until all the connection operations have finished processing to return the data, allowing for a quick response.
Whenever possible, use a restriction (where filter) to make the driver table (Row source 1) return as few rows as possible, and create a unique index on the Connection Operation Association column of the matching table (row Source 2) or a better-selective non-unique index. At this point, the execution efficiency of the nested loop connection becomes very high. If the driver table returns more rows, the connection is not efficient even if there is an index on the associated column of the matching table join operation.
(3) Hashjoin (hash join) :
Hash connections are only available for equivalent connections (that is, connection conditions =)
HASH join is not necessarily a full table scan when connecting two tables, it does not restrict the way the table is accessed;
Internal connection Process Brief:
A) Remove the data set from Row Source 1 (The driver table, also known as build table in the hash join) and build it into a hash table in memory (the hash key of the hash function is the Connection Operation Association column). Create a hash bitmap (bitmap)
b) Take out the data set of row Source 2 (Match table), the connection operation for each of the data in the column using the same hash function and find the corresponding a in the position of the data in the Hash table, in that position to check whether the matching data can be found
----------------extended reading: Hash table related----------------
Explanations from the wiki:
In computing, a hash table (hash map) are a data structure used to implement an associative array, a structure that can map Keys to values. A hash table uses a hash function to compute an index into an array of buckets or slots, from W Hich the desired value can be found.
Hash (hash) technology: In the record storage location and the record has a keyword key between the establishment of a corresponding relationship F , so that the input key, you can get the corresponding storage position F (key), the corresponding relationship F Is the hash (hash) function;
Using hashing technology to store records in a contiguous storage space, this contiguous storage space is a hash table (hash table);
The hash values obtained by hashing the same hash function in different keys should theoretically be different, but the actual may be the same, there is a hash (hash) collision in the same time, there are many ways to solve the hash conflict, such as HashMap is to use the chain address method to solve the conflict;
A hash table is a lookup-oriented data structure that looks at the position of a given value corresponding to a record in a table to obtain a specific record after entering a given value, which is fast.
--------------------------------------------------------
Three modes of HASH join:
- OPTIMAL HASH JOIN
- ONEPASS HASH JOIN
- Multipass HASH JOIN
1) OPTIMAL HASH JOIN:
OPTIMAL mode is a small result set obtained from the driver table (also known as the Build table), and the entire hash table built from the result set can be built into the memory area that the user can use.
Overview of the connection process:
Ⅰ: First, the connection operation of the row data in the build table uses the hash function to construct the result set of the build table into an in-memory hash table. , you can think of the hash table as a large square area in memory, with a lot of small squares in it, and the data in the Build table is scattered across these small squares, which are hash buckets (see Wiki definition above).
Ⅱ: Begins reading data from the matching table (probed table), and uses the same hash function for the Connection Operation Association column for each row of data, locating the hash Bucket with the same value data as the hash function in the build table.
Ⅲ: After locating the specific hash bucket, first check if there is data in the bucket and lose the line of the matching table (probed table) immediately. If there is data in it, continue to check if the data inside (the data that drives the table) matches the data of the matching table.
2): ONEPASS HASH JOIN :
The ONEPASS mode is used when the result set obtained from the driver table (also called Build table) is large, and the hash table built from the result set cannot be put into memory.
Overview of the connection process:
Ⅰ: The connection operation of the row data in the build table uses the hash function, and after constructing the hash table based on the result set of the build table, the memory cannot drop all the hash table contents, which will result in some hash buckets in memory. Some hash buckets are placed on disk, either in memory or on disk, and Oracle uses a bitmap structure to reflect the state of these hash buckets (including their location and whether there is data).
Ⅱ: Reads the matching table data and associates the column with the same hash function on each row, locating the bucket with the same value data as the hash function on the build table in bitmap. If the bucket is empty, the data that matches the table is discarded. If it is not empty, you need to see if the bucket is in memory or on disk.
If in memory, the bucket is accessed directly and the data in it is checked to see if it matches, and the query results are returned if there is a match.
If on the disk, first put this to match the data aside, will be its first temporarily in memory, and so on after accumulating a certain amount of such data to be matched, then batch of this data written to the disk (Dump probe partitions to disks).
Ⅲ: After a complete scan of the matching table, a portion of the matching data may have been returned. And then there's a hash table part of the hash bucket data on the disk and the matching data that is partially written to disk in the match table is not processed, and now Oracle will re-match the two pieces of data on the disk once and then return the final query results.
3): multipass HASH JOIN:
The Multipass mode is used when the memory is particularly small or the data of the hash table is particularly large . Multipass will read disk data multiple times and should avoid using this mode as much as possible .
3. in the ... OUTER describes the table connection type;
There are two types of table joins:
- INNER Join (inner connection)
- OUTER join (outer join)
Sample Data Description:
Existing A, b two tables, a table information as follows:
The B table information is as follows:
The following examples are shown in both A and B tables.
(1) INNER join (inner connection):
only the matching records in the two tables are returned .
INNER joins are divided into two types:
- Equivalent connection (connection condition = )
- Non-equivalent connections (connection conditions are non-= , such as > >= < <= , etc.)
The equivalent connection uses the most, the following with the equivalent connection example:
Two types of internal connections:
Ⅰ: Select a.ID a_id, A.name a_name, b.ID b_id, B.name b_name from a a inner joins b b on (a.id = b.id)
Ⅱ: Select a.ID a_id, A.name a_name, b.ID b_id, B.name b_name from A-a join b b on (a.id = b.id)
Only records that meet the join condition (a.id = b.id) are returned when connecting:
(2) OUTER Join (outer connection):
OUTER joins are divided into three types:
- Left OUTER join (can be shortened to a right-hand join, with an outer join)
- Right OUTER join (starboard join, left OUTER join)
- Full OUTER Join (complete join, all-out connection)
A) left join (right-hand connection):
The returned results include not only records that meet the join criteria, but also all records in the table on the left . (If a row in the returned left table does not have a match in the right table, the returned column in the right table is a null value)
Two types of notation:
Ⅰ:select a.id a_id, A.name a_name, b.ID b_id, B.name b_name from a-a left outer join B-B on (a.id = b.id)
Ⅱ:select a.id a_id, A.name a_name, b.ID b_id, B.name b_name from a-a LEFT join B-B on (a.id = b.id)
return Result:
b) right join:
The returned results include not only records that meet the join criteria, but also all records in the right table . (If a row in the right table returned does not have a match in the left table, the returned columns in the left table are null values)
Two types of notation:
Ⅰ:select a.id a_id, A.name a_name, b.ID b_id, B.name b_name from a-a right outer join B-B on (a.id = b.id)
Ⅱ:select a.id a_id, A.name a_name, b.ID b_id, B.name b_name from a-a right join B-B on (a.id = b.id)
return Result:
c) full join (fully connected):
returns all records for the left and right tables . (both left and right mismatched entries are replaced with null values)
Two types of notation:
Ⅰ:select a.id a_id, A.name a_name, b.ID b_id, B.name b_name from a-a full outer join B-B on (a.id = b.id)
Ⅱ:select a.id a_id, A.name a_name, b.ID b_id, B.name b_name from a-a full join B-B on (a.id = b.id)
return Result:
---------------------Extended reading: (+) operator-------------------
The (+) operator is an Oracle-specific notation that represents an outer join (which can only represent left and right outer joins) and needs to be used with the where statement.
Special Note: the(+) operator represents a right connection on the join condition of the left table, and represents a left connection on the join condition of the right table .
Such as:
Ⅰ:select a.id a_id, A.name a_name, b.ID b_id, B.name b_name from a A, b b where a.id = b.ID (+)
Query Result:
Actual with left connection select a.ID a_id, A.name a_name, b.ID b_id, B.name b_name from a-a LEFT join B-B on (a.id = b.id) Effect equivalent
Ⅱ:select a.id a_id, A.name a_name, b.ID b_id, B.name b_name from a A, b b where a.id (+) = b.ID
Query Result:
Actual with right connection select a.ID a_id, A.name a_name, b.ID b_id, B.name b_name from a-a right join B-B on (a.id = b.id) Effect equivalent
----------------------------------------------------------
Add:
Self-connect (by assigning two different aliases to a table to connect to itself or outside)
[GO] read Oracle execution plan