Optimizing SQL by analyzing execution plans for SQL statements (ii)

Source: Internet
Author: User
Tags count create index execution hash implement include sort variable
optimization | statement | execution

5th. Oracle's execution plan

background knowledge:

To better carry out the following, we must understand some conceptual terms:

   Sharing SQL statements

In order not to parse the same SQL statement repeatedly (because parsing operations cost resources, resulting in performance degradation), after the first parsing, Oracle stores SQL statements and parsed execution plans in memory. This memory in the shared buffer pool, which is located in the system global Zone's SGA (System global Area), can be shared by all database users. So, when you execute an SQL statement (sometimes referred to as a cursor), if the statement is exactly the same as the one previously executed, and the statement that was executed before is still in memory, Oracle does not need to analyze it to get the execution path of the statement directly. This feature of Oracle greatly improves the execution performance of SQL and greatly saves memory usage. The key to using this feature is to put the executed statements into memory as much as possible, so this requires a large shared pool (by setting the shared buffer pool parameter value) and executing the SQL statement as much as possible using the binding variable.

When you submit an SQL statement to Oracle, Oracle first looks in shared memory for the same statement. It should be noted here that Oracle is a strict match between the two, to achieve sharing, the SQL statement must be exactly the same (including spaces, line wraps, etc.).

The following are steps to determine whether an SQL statement is the same as a SQL in shared memory:
1. Hashed the text string of the emitted statement. If the hash value is the same as the hash value of the SQL statement already in the shared pool, proceed to step 2nd:
2 the text string (including capitalization, whitespace, and annotation) of the emitted statement with all the identified in step 1th
An existing SQL statement is compared.
For example:
SELECT * from emp WHERE empno = 1000;
is different from each of the following
SELECT * from emp WHERE empno = 1000;
SELECT * from EMP WHERE empno = 1000;
SELECT * from emp WHERE empno = 2000;
In the above statement, the column values are in the direct SQL statement, and we will later make this kind of SQL hard-coded SQL or literal SQL

The binding variable (bind variables) with the same name must be used in the SQL statement that uses the binding variable.
For example:
A. The 2 SQL statements are considered to be the same
Select PIN, name from people where pin =: blk1.pin;
Select PIN, name from people where pin =: blk1.pin;
B. The 2 SQL statements are considered to be different
Select PIN, name from people where pin =: blk1.ot_ind;
Select PIN, name from people where pin =: blk1.ov_ind;
In the future, we'll call this kind of statement the binding variable SQL.

3. Compare the objects involved in the emitted statement with the objects involved in the existing statement identified in step 2nd.
For example:
If the user User1 and the user user2 have an EMP table, then
User User1 statement: SELECT * from EMP; And
User User2 statement: SELECT * from EMP; is considered to be a different statement,
Because the EMP referenced in two statements does not refer to the same table.

4. The bundle type of the bundle variable used in the SQL statement must be the same.

If the statement is equivalent to another statement currently in the shared pool, Oracle does not parse it. The direct execution of the statement improves execution efficiency, because parsing is more resource intensive.

Note that, starting with Oracle 8i, a new cursor_sharing parameter is introduced, the main purpose of which is to address the problem of hard-coded SQL that has been used heavily during programming. Because in actual development, many of the program staff to improve the speed of development, but similar to the following development methods:
Str_sql string;
Int_empno int;
Int_empno = 2000;
Str_sql = ' SELECT * from emp WHERE empno = ' + int_empno;
............
Int_empno = 1000;
Str_sql = ' SELECT * from emp WHERE empno = ' + int_empno;

The code above actually uses hard-coded SQL so that we can't use shared SQL, and the result is that the database is inefficient. However, from the above 2 statements, the resulting hard-coded SQL is only the column values are different, the other parts are the same, if only because the column value is different to cause these 2 statements can not be shared is a pity, in order to solve this problem, introduced the cursor_sharing parameter, This can also be done using shared SQL so that development can take advantage of shared SQL functionality. That sounds good, Oracle is really for the sake of the user, so that users can take advantage of the shared SQL functionality without changing the code. Is it really so? Heaven doesn't drop a pie for no reason. Oracle describes the use of this parameter, and recommends that you change the value of the parameter after the actual test (by default, the value of the parameter is exact and the statement is fully consistent to use shared SQL). Because it is possible to change this value, your hard-coded SQL can use shared SQL, but the performance of the database will decline. I have encountered this situation in practical applications. Therefore, it is advisable to write a developer who needs to run the program stably to use the SQL of the binding variable at the outset.

The concept of rowID:

rowID is a pseudo column, since it is a pseudo column, then the column is not user-defined, but the system itself is added. There is a rowid pseudo column for each table, but the value of the ROWID column is not physically stored in the table. However, you can use it as you would any other column, but you cannot delete the column or modify or insert the value of the column. Once a row of data is inserted into the database, ROWID is unique within the life cycle of the row, that is, even if the row produces a row migration, the rowid of the row does not change.

Why use rowID

ROWID provides the quickest access to a given row in a table, which can be positioned directly over the corresponding block of data by ROWID and then read to memory. When we create an index, the index not only stores the value of the indexed column, but also stores the rowid of the row corresponding to the index value, so that we can quickly find the rowid of the corresponding row through the index, through which we quickly query the data. This is why we use the index query faster.

In previous versions of ORACLE8, ROWID consisted of file, block, ROW number. With the extension of the concept of object in Oracle8, ROWID has changed, ROWID is composed of object, FILE, block, ROW number. The ROWID can be decomposed into the above parts by using Dbms_rowid, and the above parts can be formed into an effective rowid.

Recursive SQL Concepts

Sometimes in order to execute an SQL statement issued by a user, Oracle must execute some additional statements, which we call ' recursive calls ' or ' recursive SQL statements '. When a DDL statement is issued, Oracle always implicitly emits some recursive SQL statements to modify the data dictionary information so that the user can successfully execute the DDL statement. Recursive calls often occurs when the required data dictionary information is not in shared memory, and these recursive calls read the data dictionary information from the hard disk into memory. Users do not care about the execution of these recursive SQL statements, and Oracle automatically executes the statements internally when needed. Of course, both DML statements and select can cause recursive SQL. Simply put, we can think of triggers as recursive SQL.

Row Source (row sources)

Used in queries, a set of eligible rows returned by the previous operation, which can be a collection of all row data for a table, or a collection of partial row data for a table, or a collection of row data that can be obtained after a join operation on the top 2 row source, such as a join connection.

predicate (predicate)

Where to restrict conditions in a query

Driving Table (driver)

The table is also called the outer table (OUTER table). This concept is used in nesting and hash joins. If the row source returns more row data, it has a negative effect on all subsequent operations. Note that although translated as a driver table, it is actually more precise to translate to a driver row source (driving row sources). In general, the table that returns fewer row sources is used as the driving table after applying the constraints of the query, so if a large table has a constraint on the Where condition (such as an equivalent limit), then the large table is also appropriate as a driver, so it is not just a small table that can be used as a driver The correct argument should be the table that returns fewer row sources after applying the constraint criteria for the query. In the execution plan, it should be the upper row source, followed by a specific description. In our description later, the table is generally referred to as row Source 1 of the join operation.

probed table (being probed)

The table is also called an inner table (INNER table). After we get a specific row of data from the driver table, we look for the row in the table that matches the join condition. So the table should be a large table (it should actually be a table that returns a larger row source) and should have an index on the corresponding column. In our description later, the table is generally referred to as row Source 2 of the join operation.

composite Index (concatenated index)

Indexes consisting of multiple columns, such as the CREATE Index idx_emp on EMP (col1, col2, col3, ...), then we call the Idx_emp index as a combined index. There is an important concept in a composite index: the boot column (leading column), in the example above, where col1 is listed as the bootstrap column. When we make a query, we can use "where col1 =?" ", you can also use the WHERE col1 =?" and col2 =?, such restrictions will use the index, but "where col2 =?" The index is not used by the query. The restricted condition does not use the combined index when it contains a pilot column.

Optional (selectivity):

You can determine the selectivity of a column by comparing the number of unique keys in the following and the number of rows in the table. If the ratio of the number of unique keys/table rows in the column is closer to 1, the higher the selectivity of the column, the more appropriate the column is to create the index, and the higher the selectivity of the index. When a query is made on a column that is optional, it returns fewer data and is better suited to using an indexed query.


With these background knowledge, you will begin to introduce the implementation plan. To execute a statement, Oracle may have to implement many steps. Each step in these steps may be to physically retrieve rows of data from a database, or to prepare rows of data in some way for use by users who emit statements. The combination of these steps that Oracle uses to execute statements is called the execution plan. The execution plan is the most complex and critical part of SQL optimization, and it is only when we know how Oracle executes the SQL statement internally that we know whether the optimizer chooses an execution plan that is optimal. The execution plan is as important to the DBA as the financial statement is to the financial staff. So the main problem we face is how to get the execution plan and how to analyze the execution plan to find out the main problem that affects performance. The following is an introduction to the Analysis tree execution plan, then describes how to get the execution plan, and then describes how to analyze the execution plan.

Example: This example shows the execution plan for the following SQL statement.
SELECT ename, Job, Sal, dname
From EMP, dept
WHERE Emp.deptno = Derpt.deptno
And not EXISTS
(SELECT *
From Salgrade
WHERE emp.sal BETWEEN losal and Hisal);

This statement queries the name, job, salary, and department name of all employees whose salary is not within the recommended salary range. Figure 5-1 below shows a graphical representation of an execution plan:

steps to execute a plan

Each step in the execution plan returns a set of rows that are either used for the next step or returned to the user or application that issued the SQL statement at the final stage. A set of rows returned by each step is called a row source (row sources). Fig. 5-1 The tree chart shows the flow of data from one step to another. The number of each step reflects the order in which you observe the steps shown in the execution plan (how to observe the execution plan will be briefly described). In general, this is not the order in which each step is executed. Perform each step of the plan or retrieve rows from the database, or receive row data from one or more row sources as input: the steps indicated by the Red Word box physically retrieve data from the data file in the database. This step is called an access path, followed by a detailed description of the access paths available to Oracle:
L steps 3rd and 6th respectively read all rows from the EMP table and the Salgrade table.
L Step 5th finds each DEPTNO value returned by step 3 in the Pk_deptno index. It finds the rowid of those rows associated with the Dept table.
The 4th step retrieves from the Dept table those rows returned by the ROWID to step 5th.
The steps indicated by the Black Word box operate on the line source, such as the association between 2 tables, sorting, or filtering, and the following will give a detailed description:
The 2nd step implements nested loops (equivalent to nested loops in a C statement), receives the row source from steps 3rd and 4th, joins each line from the 3rd step source with the corresponding row in its 4th step, and returns the result line to the 1th step.
L Step 1th completes a filter operation. It receives the row sources from steps 2nd and 6th, eliminates the rows in step 2nd that have rows in the 6th step, and returns the remaining rows from the 2nd step to the user or application that issued the statement.

the order in which the execution plan steps are implemented

The steps in the execution plan are not implemented in the order in which they are numbered: Oracle first implements the steps (such as steps 3, 5, 6) that appear as leaves in the tree structure graph of Figure 5-1. The row returned by each step is called the row source for the next step. Oracle then implements the parent step.

For example, to perform the statements in Figure 5-1, Oracle implements these steps in the following order:
L First, Oracle implements step 3 and returns the resulting row to step 2nd in one line.
L for each row returned in step 3rd, Oracle implements these steps:
-Oracle implements Step 5 and returns the result rowID to step 4th.
Oracle implements step 4 and returns the resulting row to step 2nd.
Oracle Implementation Step 2 will accept a row from step 3rd and a row from step 4th, and return to the 1th step line.
Oracle implements step 6 and returns it to step 1th if a result line is available.
Oracle implements Step 1, and if you return rows from step 6, Oracle returns the row from step 2nd to the user who issued the SQL statement.

Note that Oracle implements steps 5,4,2,6 once for each row returned by step 3rd. Many parent steps require only a single line from their child steps before they can execute. For such a parent step, the parent step (and possibly the remainder of the execution plan) is implemented as soon as a single row is returned from the child step. If the parent step of the parent step can also return activation through a single line, it is also executed. Therefore, execution can be concatenated in the tree and may contain the remainder of the execution plan. For such an operation, you can use First_rows as an optimization goal to enable you to quickly respond to a user's request.
For each row that is retrieved sequentially for each child step, Oracle implements the parent step and all the steps that are concatenated together. The parent steps that are triggered for each row returned by the child step include table access, indexed access, nested loops joins, and filters.

Some parent steps need all the rows from the child steps before they are implemented. For such a parent step, Oracle cannot implement the parent step until all rows are returned from the child step. Such parent steps include sorting, sorting a merged connection, group functionality, and grand totals. For such an operation, you cannot use First_rows as an optimization goal, but you can use all_rows as an optimization objective to minimize the amount of resources that are consumed by this type of operation.

Sometimes the statement execution, not as mentioned above, step-by-step with the first, but may run in parallel, such as in the actual environment, 3, 5, 4 steps may run in parallel in order to achieve better efficiency. From the tree diagram above, it is difficult to see the order in which each operation is performed, and another form of execution plan that is generated by Oracle makes it easy to see which operation is executed first and which is executed, and the execution plan is what we really need, followed by a detailed explanation. Now let's look at some preliminary knowledge.

Access Path (method)--Access path

An important choice that the optimizer needs to make when it comes to executing the plan is how to query the data from the database. For any row in any table accessed by an SQL statement, there may be many access paths (accessor methods) through which you can locate and query the data that you want. The optimizer chooses the path that it considers itself to be optimized.

At the physical level, Oracle reads the data, the smallest unit of read at a time is the database block (consisting of multiple contiguous operating system blocks), and the maximum read at a time is determined by the maximum value of the operating system I/O to the Multiblock parameter, so even if only one row of data is required, It also reads the database block where the row is located into memory. Logically, Oracle accesses data using the following access methods:

1 full table scan (Scans, FTS)

To implement a full table scan, Oracle reads all the rows in the table and checks to see if each row satisfies the where constraint of the statement. Oracle sequentially reads each chunk of data allocated to the table until it is read to the highest watermark (high water mark, HWM, identifies the last block of data in the table). A multi-block read operation allows one I/O to read multiple blocks of data (db_block_multiblock_read_count parameter settings), rather than reading only one block of data, which greatly reduces I/O total number of times and improves system throughput. Therefore, the use of multiple-block reading method can be very efficient implementation of the full table scan, and only in the case of full table scan to use the multiple-block read operation. In this access mode, each block of data is read only once. Since the HWM identifies the last piece of data that is read, and the delete operation does not affect the HWM value, all the data for a table is removed and the time for the full table scan does not improve, and we typically use the truncate command to make the HWM value 0. Fortunately, Oracle 10G can artificially shrink the value of HWM.

Data that is read in FTS mode is placed at the end of the cached least recently Used (LRU) list, which allows it to quickly swap out memory so that memory-critical data is not swapped out of memory. Prerequisites for using FTS: full-table scans are not recommended on larger tables unless the data is fetched more than 5%-10% of the total, or you want to use the parallel query feature.
Examples of using full table scans:
~~~~~~~~~~~~~~~~~~~~~~~~
Sql> explain plan for SELECT * from dual;
Query Plan
-----------------------------------------
SELECT STATEMENT [CHOOSE] cost=
TABLE ACCESS Full DUAL

2 via ROWID table access (table access by ROWID or rowID lookup)

The rowid of the row indicates the data file, block, and position of the row in the block, so access to the data via ROWID can be quickly positioned on the target data and is the fastest way for Oracle to access the single line of data. In order to access the table via ROWID, Oracle first obtains the ROWID of the selected row, either from the statement's WHERE clause or through the index scan of one or more indexes of the table. Oracle then locates each selected row based on the resulting rowid.

This access method does not use multiple read operations, and a single I/O can read only one block of data. We will often see this access method in the execution plan, such as querying the data through the index.

Ways to use ROWID access:
Sql> explain plan for SELECT * FROM dept where rowid = ' AAAAYGAADAAAAATAAF ';
Query Plan
------------------------------------
SELECT STATEMENT [CHOOSE] Cost=1
TABLE ACCESS by ROWID DEPT [analyzed]

3 Index Scan (Index scan or index lookup)

We first find the ROWID value of the data by index (multiple ROWID values may be returned for a non-unique index), and then get concrete data directly from the table based on rowID, which is called Index Scan or index lookup. A rowid uniquely represents a row of data that corresponds to a block of data that is obtained through an I/O, in which case the secondary I/O reads only one database block.

In the index, in addition to storing the values for each index, the index stores the ROWID value for the row with this value. An index scan can consist of 2 steps: (1) scan the index to get the corresponding ROWID value. (2) Read the specific data from the table by finding the ROWID. Each step is a separate I/O, but for the index, due to frequent use, most of the cache is already in memory, so the 1th step I/O is often logical I/O, that data can be obtained from memory. But for step 2nd, if the table is large, its data cannot be all in memory, so its I/O is most likely physical I/O, which is a mechanical operation, which is extremely time-consuming relative to logical I/O. So if you have an index scan for a large table, and if you take out more than 5%-10% of the total, using an index scan can be a lot less efficient.
As shown in the following:
Sql> explain plan for select Empno, ename from EMP where empno=10;
Query Plan
------------------------------------
SELECT STATEMENT [CHOOSE] Cost=1
TABLE ACCESS by ROWID EMP [analyzed]
INDEX UNIQUE SCAN EMP_I1

Note that table access by ROWID the EMP section, which indicates that it is not accessing data through the FTS access path, but rather accessing the data through the ROWID lookup access path. In this case, the desired rowid is due to the Empno column found in the index, this way is the index UNIQUE scan lookup, followed by an introduction, EMP_I1 the index name used for the index lookup.

However, if the query data can be found all in the index, you can avoid the 2nd step, avoid unnecessary I/O, at this time even through the index scan to remove more data, the efficiency is very high, because this will only read in the index. So the above I'm introducing the Rule-based optimizer using select COUNT (ID) from Swd_billdetail where CN < ' 6 ' instead of using select COUNT (CN) from Swd_billdetail where CN < ' 6 '. Because in practice, querying only the values of indexed columns is extremely rare, so if I use COUNT (CN) in a query, it is not representative.

Sql> explain plan for select empno from EMP where empno=10; --Query Empno column values only
Query Plan
------------------------------------
SELECT STATEMENT [CHOOSE] Cost=1
INDEX UNIQUE SCAN EMP_I1

Further, if you sort the indexed columns in an SQL statement, because the indexes are sorted well, you do not need to sort the indexed columns in the execution plan
Sql> explain plan for select Empno, ename from EMP
where Empno > 7876 order by empno;
Query Plan
--------------------------------------------------------------------------------
SELECT STATEMENT [CHOOSE] Cost=1
TABLE ACCESS by ROWID EMP [analyzed]
INDEX RANGE SCAN emp_i1 [analyzed]

As you can see from this example: Because the indexes are already sorted, the rows that meet the criteria are queried in the order of the indexes, thus avoiding further sorting operations.

[1] [2] Next page



Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.