The magical forward-looking predicate generation technique

Source: Internet
Author: User
Tags hash one table

Starting with the IBM I V5R3 version, the SQL query engine (SQE) has begun to support a powerful strategy that minimizes input/output (I/O) and maximizes query performance. The new strategy, like magic, allows the optimizer to generate predicates without predicate conditions and rewrite the query. This paper discusses the prospective predicate generation technology (LOOK-AHEAD predicate generation) and the benefits it brings.

Referring to query optimization, minimizing or even eliminating the reading and processing of invalid data is key to improving query performance, because I/O operations are relatively slow operations. The task of the query optimizer is to choose the right data access method, establish the strategy of accessing and processing data as fast and efficiently as possible, and if there are a large number of strategies to choose from, the optimizer can usually choose and establish the appropriate access plan to meet the user's response time requirements.

In a query, the local selection predicate is used to specify which rows need to be selected for processing. One way to exclude data, which will no longer be used, is to read the table's data and test the values. Another technique is not to read table data, but to rely on mathematical principles and other database objects to avoid testing rows in a table. For example, there is a query request with a local selection condition, the local selection condition selects one row of data from the 100 million rows, the database engine can either complete the query by reading and testing each row of data (100 million tests), or use the index to read the qualifying row of data, which eliminates the need to read and test each row of data. Although the results of these two methods of execution are the same, their performance differences are significant.

Another query scenario that can cause a lot of read (poor performance) operations is a table connection. A connection typically causes the database engine to read data from one table and then perform a read operation in another table to find all the matching rows. By definition, a connection can reduce the result set, but this requires testing each row of data. These read operations and the eventual mismatch of large amounts of data can cause serious performance problems for such queries. If the optimizer is able to use certain policies to eliminate rows before the connection, query performance can be greatly improved, and the connection will require much less system resources.

DB2 for I supports many methods and policies to reduce the number of rows and processing data that need to be read, one of which is the forward-looking predicate generation technology (LOOK-AHEAD predicate generation), a powerful tool that can greatly reduce query time.

Here is a simple connection case that explains why LPG can have such a positive impact on query performance.

Two connection methods

Imagine two tables: Small table A is relatively small, large table B is relatively large. A query request involves both tables, there is a local selection condition on small table A, there is no local selection on large table B, and the local selection of table A is high (selectivity) (for example, it determines the two rows on a small table a).

SELECT *

From Smalltablea A,

Largetableb B

WHERE A.join_col = B.join_col

and A.col1 in (112358, 132134)

If there is no index on big table B, you will have several ways to access it and a combination of connections:

One strategy is to use a large table B to connect small table A, a full table scan of large table B, and a table for each row of table B to be connected to a tables accessed through an index or hash table, as shown in Figure 1:

Figure 1: Large table B connection small Table A

Because there are no local selection conditions on large table B, each row selected from large table B is used to connect to the small table A, which produces a large number of read operations.

Another strategy is to use small table A to connect to large table B, which typically scans the entire table for a temporary index or a temporary hash table that is generated by large table B, as shown in Figure 2:

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.