Oracle-how to analyze execution plans

Source: Internet
Author: User

Example 1:
If LARGE_TABLE is a large table with no index on the username column, run the following statement:
SQL> SELECT * FROM LARGE_TABLE where USERNAME = 'test ';
Query Plan
-----------------------------------------
Select statement Optimizer = CHOOSE (Cost = 1234 Card = 1 Bytes = 14)
Table access full LARGE_TABLE [: Q65001] [ANALYZED]

In this example, table access full LARGE_TABLE is the first operation, which means to perform a full table scan on the LARGE_TABLE TABLE. After this operation is completed, the data in the generated row source is sent to the next step for processing. In this example, the select statement operation is the last step of the query STATEMENT.

Ptimizer = CHOOSE indicates the optimizer_mode of the query, that is, the value specified by the optimizer_mode initialization parameter. It does not mean that the optimizer is used when the statement is executed. The only way to determine which optimizer the statement uses is to look at the cost section below. For example, if the following format is provided, it indicates that the CBO optimizer is used, and the cost here indicates that the optimizer considers the cost of the execution plan:
Select statement Optimizer = CHOOSE (Cost = 1234 Card = 1 Bytes = 14)

However, if the information shown in the execution plan is similar to the following, it indicates that the RBO optimizer is used because the value of the cost part is null or there is no cost part at all.
Select statement Optimizer = CHOOSE Cost =
Select statement Optimizer = CHOOSE
In this way, we can find out the Optimizer used to execute the statement from the information after Optimizer. Specifically, if Optimizer = ALL_ROWS | FIRST_ROWS | FIRST_ROWS_n, the CBO Optimizer is used. If Optimizer = RULE, the RBO Optimizer is used.

The value of the cost attribute is a value that is used internally in oracle to compare the cost of each execution plan, so that the optimizer can select the best execution plan. The cost values of different statements are not comparable. You can only compare the cost values of different execution plans of the same statement.

[: Q65001] indicates that the query is run in parallel. The data in it indicates that this operation is processed by a slave process in parallel query, so that this operation can be different from the operations performed in serial mode.

[ANALYZED] indicates that the object referenced in the operation has been ANALYZED, and the statistical information of this object is available for CBO in the data dictionary.

Example 2:
Assume that A, B, and C are not small tables, and A composite index on Table A: a (a. col1, a. col2). Note that the. col1 column is the index guide column.
Consider the following query:
Select A. col4
From A, B, C
Where B. col3 = 10 and A. col1 = B. col1 and A. col2 = C. col2 and C. col3 = 5
Execution Plan
----------------------------------------------------------
0 select statement ptimizer = CHOOSE
1 0 MERGE JOIN
2 1 SORT (JOIN)
3 2 NESTED LOOPS
4 3 table access (FULL) OF 'B'
5 3 table access (by index rowid) OF 'A'
6 5 INDEX (range scan) OF 'partition _ col12a' (NON-UNIQUE)
7 1 SORT (JOIN)
8 7 table access (FULL) OF 'C'

Statistics
----------------------------------------------------------
0 recursive cballs
8 db block gets
6 consistent gets
0 physical reads
0 redo size
551 bytes sent via SQL * Net to client
430 bytes encoded ed via SQL * Net from client
2 SQL * Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
6 rows processed

When a table is connected, only two tables can be connected first, and then the connected result is used as a row source to connect to the remaining tables. In the preceding example, the connection sequence is B and A, and then C:
B <---> A <---> C
Col3 = 10 col3 = 5

If there is no execution plan, analyze which of the above three tables should be used as the first driver table? From the perspective of SQL statements, only table B and Table C have restrictions. Therefore, the first driver table should be one of the two tables. Which one is it?

Table B has a predicate B. col3 = 10. In this way, when a full table scan is performed on table B, the restriction conditions in the where clause (B. col3 = 10) to get a smaller row source. Therefore, table B should be used as the first driver table. In this case, if you associate with table A, you can effectively use the index of Table A (because the col1 column of Table A is leading column ).

Of course, in the above query, Table C also has a predicate (C. col3 = 5). Some people may think that table C can also achieve better performance as the first driver table. Let's analyze it again: if Table C is the first driver table, it can ensure that the driver table generates A small row source, but let's look at connection Condition. col2 = C. col2, there is no chance to use the index of Table A at this time, because the col2 column of Table A is not leading column, so the efficiency of the nested loop is very poor, resulting in poor query efficiency. Therefore, it is important to select the correct driver table for the NL connection.

Therefore, the above query results are in A better connection order (B-> A)-> C. If the database is a cost-based optimizer, it will use the calculated cost to determine the proper drive table and the proper connection sequence. In general, CBO selects the correct connection sequence. If CBO selects a poor connection sequence, we can also use hints provided by ORACLE to make CBO adopt the correct connection sequence. As follows:

Select/* + ordered */A. col4
From B, A, C
Where B. col3 = 10
And A. col1 = B. col1
And A. col2 = C. col2
And C. col3 = 5

Since it is so important to select the correct driver table, let's take a look at the execution plan and how the tables are associated to obtain which table in the execution plan should be the driver table:
In the execution plan, you need to know which operation is performed first and which operation is performed later, which is useful for determining which table is the driving table. Before determining, if the table is accessed through rowid and the value of rowid is obtained from the index scan, the index scan is temporarily removed from the execution plan. Then, in the remaining part of the Execution Plan, the guiding principle for determining the execution sequence is: the rightmost and rightmost operations are executed first. The specific explanation is as follows:
Get the execution plan after removing the index scan that hinders judgment:
Execution Plan
----------------------------------------------------------
0 select statement ptimizer = CHOOSE
1 0 MERGE JOIN
2 1 SORT (JOIN)
3 2 NESTED LOOPS
4 3 table access (FULL) OF 'B'
5 3 table access (by index rowid) OF 'A'
7 1 SORT (JOIN)
8 7 table access (FULL) OF 'C'
Check the execution plan's 3rd columns, namely, the letter section. The left side of each column value contains spaces as indentions. The more spaces on the left of the column value, the more indent the column value and the more right the column value is. As shown in the preceding execution plan, the rows with the first column value of 6 have the most indentation, that is, the row is the rightmost; the rows with the first column value of 4 and 5 have the same indentation, the degree to which the column is right is the same, but the row with the first column value 4 is closer to the row with the first column value 5. When talking about the upper and lower relations, valid only for consecutive rows with consistent indentation.

From this figure, we can see that for the nested loops part, the rightmost and rightmost operations are table access (FULL) OF 'B', so this operation is executed first, therefore, table B corresponding to this operation is the first drive table (External table). Naturally, Table A is the internal table. We can also see that the nested loop between table B and table A generates A new row source. After sorting the row source, the row source corresponding to the C table. col3 = 5 constraints. Therefore, the following facts can be obtained: Table B first performs A nested loop with table A, and then sorts and merges the generated row source with Table C.

By analyzing the preceding execution plan, we cannot say that table C must be read only after table B and table A. In fact, table B and Table C may be read into the memory at the same time, because the operation of reading data from a table into memory may be parallel. In fact, many operations may be performed in parallel, because when ORACLE reads data, if a row of data is required, it also reads the entire data block of the row into the memory, and may read multiple data blocks.
When we look at the execution plan, the key is not to see which operation is executed first and which operation is executed later, but to view the connection sequence between tables (for example, to know which is the driver table, this needs to be determined from the operation order), the type of association used, and the specific access path (for example, whether an index is used)

After determining which table is the driving table from the execution plan, we can determine whether the table is suitable as the driving table (just like the ABC table above) based on our knowledge. If not, modify the SQL statement so that the optimizer can select the correct driver table.

For the RBO optimizer:
In the ORACLE document, for RBO, select the driver table from right to left in the from clause, that is, the rightmost table is the first driver table: all things being equal RBO chooses the driving order by taking the tables in the FROM clause RIGHT to LEFT. However, in my tests, I have never verified that this statement is correct. In my opinion, even in RBO, there is also a set of rules to decide which connection type and which table to use as the driving table. In the selection, the current index will definitely be taken into account, the restriction condition in the where clause may also be considered, but it must be irrelevant to the location of the restriction condition in the where clause.

Test:
If I create three tables:
Create table A (col1 number (4, 0), col2 number (4, 0), col4 char (30 ));
Create table B (col1 number (4, 0), col3 number (4, 0), name_ B char (30 ));
Create table C (col2 number (4, 0), col3 number (4, 0), name_c char (30 ));
Create index inx_col12A on a (col1, col2 );
Run the query:
Select A. col4
From B, A, C
Where B. col3 = 10
And A. col1 = B. col1
And A. col2 = C. col2
And C. col3 = 5;
Execution Plan
----------------------------------------------------------
0 select statement ptimizer = RULE
1 0 MERGE JOIN
2 1 SORT (JOIN)
3 2 NESTED LOOPS
4 3 table access (FULL) OF 'B'
5 3 table access (by index rowid) OF 'A'
6 5 INDEX (range scan) OF 'partition _ col12a' (NON-UNIQUE)
7 1 SORT (JOIN)
8 7 table access (FULL) OF 'C'

Select A. col4
From B, A, C
Where A. col1 = B. col1
And A. col2 = C. col2;
Execution Plan
----------------------------------------------------------
0 select statement ptimizer = RULE
1 0 MERGE JOIN
2 1 SORT (JOIN)
3 2 NESTED LOOPS
4 3 table access (FULL) OF 'B'
5 3 table access (by index rowid) OF 'A'
6 5 INDEX (range scan) OF 'partition _ col12a' (NON-UNIQUE)
7 1 SORT (JOIN)
8 7 table access (FULL) OF 'C'

After the index inx_col12A of Table A is deleted:
Select A. col4
From B, A, C
Where A. col1 = B. col1
And A. col2 = C. col2;
Execution Plan
----------------------------------------------------------
0 select statement ptimizer = RULE
1 0 MERGE JOIN
2 1 SORT (JOIN)
3 2 MERGE JOIN
4 3 SORT (JOIN)
5 4 table access (FULL) OF 'C'
6 3 SORT (JOIN)
7 6 table access (FULL) OF 'A'
8 1 SORT (JOIN)
9 8 table access (FULL) OF 'B'

Using the above examples, I am skeptical about the "All things being equal RBO chooses the driving order by taking the tables in the FROM clause RIGHT to LEFT" statement in the oracle document. At this point, I cannot use hints to force the optimizer to use the nested loop. If hints is used, the CBO optimizer is automatically used instead of the RBO optimizer.

For the CBO optimizer:
CBO selects the driver table based on the statistical information. If no statistical information is available, select the driver table from left to right in the from clause. This is in the opposite order of RBO selection. This is the original English version (CBO determines join order from costs derived from gathered statistics. if there are no stats then CBO chooses the driving order of tables from LEFT to RIGHT in the FROM clause. this is OPPOSITE to the RBO ). I still cannot confirm the correctness of this sentence. However, after verification, "if you use the ordered prompt (CBO at this time), select the driver table in the order from left to right in the from clause" is correct. In fact, in CBO, if there is statistical data (that is, the table and index are analyzed), the optimizer will automatically decide which connection type to use based on the cost value and select the appropriate driver table, this has nothing to do with the locations of each restriction condition in the where clause. If we want to change the connection type or driver table selected by the optimizer, we need to use hints. The usage of hints will be introduced later.

Test:
If I create three tables:
Create table A (col1 number (4, 0), col2 number (4, 0), col4 char (30 ));
Create table B (col1 number (4, 0), col3 number (4, 0), name_ B char (30 ));
Create table C (col2 number (4, 0), col3 number (4, 0), name_c char (30 ));
Create index inx_col12A on a (col1, col2 );

Run the query:
Select A. col4
From B, A, C
Where B. col3 = 10
And A. col1 = B. col1
And A. col2 = C. col2
And C. col3 = 5;
Execution Plan
----------------------------------------------------------
1 select statement ptimizer = ALL_ROWS (Cost = 3 Card = 1 Bytes = 110)
1 0 nested loops (Cost = 3 Card = 1 Bytes = 110)
2 1 merge join (CARTESIAN) (Cost = 2 Card = 1 Bytes = 52)
3 2 table access (FULL) OF 'B' (Cost = 1 Card = 1 Bytes = 26)
4 2 SORT (JOIN) (Cost = 1 Card = 1 Bytes = 26)
5 4 table access (FULL) OF 'C' (Cost = 1 Card = 1 Bytes = 26)
6 1 table access (FULL) OF 'A' (Cost = 1 Card = 82 Bytes = 4756)

Select A. col4
From B, A, C
Where A. col1 = B. col1
And A. col2 = C. col2;
Execution Plan
----------------------------------------------------------
0 select statement ptimizer = ALL_ROWS (Cost = 5 Card = 55bytes = 4620)
1 0 hash join (Cost = 5 Card = 55 Bytes = 4620)
2 1 hash join (Cost = 3 Card = 67 Bytes = 4757)
3 2 table access (FULL) OF 'B' (Cost = 1 Card = 82 Bytes = 1066)
4 2 table access (FULL) OF 'A' (Cost = 1 Card = 82 Bytes = 4756)
5 1 table access (FULL) OF 'C' (Cost = 1 Card = 82 Bytes = 1066)

After the index inx_col12A of Table A is deleted:
Select A. col4
From B, A, C
Where A. col1 = B. col1
And A. col2 = C. col2;
Execution Plan
----------------------------------------------------------
0 select statement ptimizer = ALL_ROWS (Cost = 5 Card = 55bytes = 4620)
1 0 hash join (Cost = 5 Card = 55 Bytes = 4620)
2 1 hash join (Cost = 3 Card = 67 Bytes = 4757)
3 2 table access (FULL) OF 'B' (Cost = 1 Card = 82 Bytes = 1066)
4 2 table access (FULL) OF 'A' (Cost = 1 Card = 82 Bytes = 4756)
5 1 table access (FULL) OF 'C' (Cost = 1 Card = 82 Bytes = 1066)

Select/* + ORDERED */A. col4
From C, A, B
Where B. col3 = 10
And A. col1 = B. col1
And A. col2 = C. col2
And C. col3 = 5;
Execution Plan
----------------------------------------------------------
1 select statement ptimizer = ALL_ROWS (Cost = 3 Card = 1 Bytes = 110)
1 0 nested loops (Cost = 3 Card = 1 Bytes = 110)
2 1 nested loops (Cost = 2 Card = 1 Bytes = 84)
3 2 table access (FULL) OF 'C' (Cost = 1 Card = 1 Bytes = 26)
4 2 table access (FULL) OF 'A' (Cost = 1 Card = 82 Bytes = 4756)
5 1 table access (FULL) OF 'B' (Cost = 1 Card = 1 Bytes = 26)
This query verifies that the table selected by the optimizer can be correctly prompted by the ORDERED prompt.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.