How to optimize Oracle SQL query by parsing ORACLE data and improve the performance of local range data scanning

Source: Internet
Author: User

Oracle SQL query optimization for Oracle Data Scanning

(2)

2. How to improve the performance of local range data scanning:

2.1 category and role of query conditions:

When we initiate a SQL statement with conditions, multiple query conditions are usually assigned to the statement. Among these query conditions, one or more query conditions are usually used as the initiators of data retrieval.Driver query Conditions"; At the same time, the query conditions that are not used as the data retrieval initiator can only filter data within the data range determined by the driving query conditions throughout the Data Query Process, these query conditions are called"Filter query Conditions".

It can be seen that the query conditions and filter query conditions are driven. One serves as the query initiator and the other serves as the data filter. The two operate on different target datasets, the query condition is used to query the entire dataset, while the filter query condition is used to query the result set obtained by the driver query condition. It can be seen that the query condition driver plays a vital role in the execution performance of the final query statement. Because they determine the maximum data range of the final query results and whether they can obtain the data within this range as quickly as possible, the filter query condition only filters data from the data range determined by the driver query condition. Of course, we cannot ignore the impact of data filtering by filtering query conditions on query performance, but it does not play a major role in the final performance.

The query condition is usually determined by the execution optimizer and the optimizer running mode (first_rows or all_rows. When determining the driving query conditions, Oracle usually considers using indexed columns as the driving query conditions, in general, Oracle evaluates whether a query condition has the primary key index, unique index, B * tree or other types of indexes, and whether it is a hand field that uses a compound index with equivalent conditions, at the same time, the cost efficiency of various indexes will be evaluated through the cost estimation method based on the optimizer mode, and the most efficient indexes will be identified and used as the driving query condition. If the query optimizer finds multiple efficient indexes, the optimizer uses its own policies to determine the driver query conditions for the optimal execution cost. Meanwhile, the query optimizer identifies, which of the following indexes are used for filtering conditions? although they are not so efficient, they can be very helpful for data retrieval driven by query conditions, the optimizer will merge these indexes and drive query conditions to achieve the best query performance.

However, when the query optimizer is used to determine the query conditions, it is usually okay, but sometimes the data distribution and statistics on data objects are too old, we have to intervene in the decision-making process of the optimizer manually because it is inconsistent with the expected data scan method. The most common intervention method is hints. This also shows that today's artificial intelligence cannot replace real human intelligence.

2.2 principles for improving the performance of the range scan:

With the knowledge of the above query condition categories, we can discuss the principle of improving the performance of local range scanning of data. Even if you think about it from common sense, you can know that the execution speed of data query from a small data set is more efficient, and data query from a large range of data sets is more inefficient. However, once the interaction between the above-mentioned search conditions and filter query conditions is integrated, it will be discovered during local range scanning, the output speed of data query in a large range is higher than that of data query in a small range. Why is this opposite to our common sense? The following example is used to illustrate this problem.

Here is a common SQL query statement.

Select * from order where ordno between 1and 1000 and custno like 'tb % ';

Assume that indexes are built on the ordno and custno fields. At the same time, we know that there are 1000 rows of data records meeting the ordno query conditions, and 10 rows of data records meeting the custno conditions. Obviously, because all query conditions have indexes and no operations such as sorting, grouping, and summary are required, this statement uses local range data scanning, and the following execution path may appear:

(1) Use ordno as the driving query condition, and custno as the filter query condition:

In this case, the query uses the ordno index to scan data rows that meet the condition, and uses the custno filter condition to filter data rows that meet the condition, then, the filtered data rows are filled into the batch array and returned as the query result. But the problem is met.Ordnobetween 1 and 1000The number of record rows with the condition is 1000.Custnolike 'tb %'If there are only 10 rows of records in the condition, use the data with only 10 rows of records to filter data with 1000 rows of records and fill in the batch array containing the final result, it can be imagined that the batch array is difficult to be quickly filled, that is, the query results are difficult to be quickly returned, in the worst case, you may need to scan the entire data range (1000 records) determined by the drive query conditions from start to end before the query can end.

(2) Use custno as the driving query condition and ordno as the filter query condition:

In this case, the query uses the custno index to scan the rows that meet the condition, and uses the ordno filter condition to filter the rows that meet the condition, then, the filtered data rows are filled into the batch array and returned as the query result. In this caseOrdnobetween 1 and 1000If there are 1000 rows in the record row with the condition, it is easy to find that the matching conditions can be met.Custnolike 'tb %'Therefore, the batch array can be easily filled with 10 rows of records, even in the worst case, it will only scan to meetCustnolike
'Tb %'
Therefore, the query results can be quickly returned.

To improve the query performance, we can exchange the query condition role, that is, exchange the driver query conditions and filter the query conditions. Therefore, we can use the following statement to rewrite the preceding query:

Select/* + index (Order custno_inx )*/*

From order where ordnobetween 1 and 1000 and custno like 'tb % ';

We use hints to influence the optimizer behavior, so that the optimizer can work outCustnoThe index on the field to initiate a query.Custnolike 'tb %'To change the filter query conditionOrdno between 1 and 1000. Hints is used here. In fact, some measures can be used to make the ordno index invalid, which can also achieve the same purpose. For example, you can use a function on the ordno field that will not affect the query result. For exampleRtrim (ordno) between1
& 1000,
However, this method has drawbacks. It may affect the optimizer to develop an execution plan that combines custno and ordno for data query.

In this example, we can see that there are two policies to improve the performance of local range scanning:

Reduce the data scope that drives the query conditions to reduce the data scanning volume;

Expand the data range of filtering query conditions as much as possible to increase the probability of filling up the batch array and returning query results quickly;

Therefore, we can obtain the following description matrix of the local range scan performance improvement policy:

Data range that meets the driver query Conditions

Data range that meets the filter query Conditions

Performance

Measures

Small

Small

High

 

Small

Large

High

 

Large

Small

Low

Role for exchanging filter conditions

Large

Large

High

 

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.