Starting with the IBM I V5R3 version, the SQL query engine (SQE) has begun to support a powerful strategy that minimizes input/output (I/O) and maximizes query performance. The new strategy, like magic, allows the optimizer to generate predicates without predicate conditions and rewrite the query. This paper discusses the prospective predicate generation technology (LOOK-AHEAD predicate generation) and the benefits it brings.
Referring to query optimization, minimizing or even eliminating the reading and processing of invalid data is key to improving query performance, because I/O operations are relatively slow operations. The task of the query optimizer is to choose the right data access method, establish the strategy of accessing and processing data as fast and efficiently as possible, and if there are a large number of strategies to choose from, the optimizer can usually choose and establish the appropriate access plan to meet the user's response time requirements.
In a query, the local selection predicate is used to specify which rows need to be selected for processing. One way to exclude data, which will no longer be used, is to read the table's data and test the values. Another technique is not to read table data, but to rely on mathematical principles and other database objects to avoid testing rows in a table. For example, there is a query request with a local selection condition, the local selection condition selects one row of data from the 100 million rows, the database engine can either complete the query by reading and testing each row of data (100 million tests), or use the index to read the qualifying row of data, which eliminates the need to read and test each row of data. Although the results of these two methods of execution are the same, their performance differences are significant.
Another query scenario that can cause a lot of read (poor performance) operations is a table connection. A connection typically causes the database engine to read data from one table and then perform a read operation in another table to find all the matching rows. By definition, a connection can reduce the result set, but this requires testing each row of data. These read operations and the eventual mismatch of large amounts of data can cause serious performance problems for such queries. If the optimizer is able to use certain policies to eliminate rows before the connection, query performance can be greatly improved, and the connection will require much less system resources.
DB2 for I supports many methods and policies to reduce the number of rows and processing data that need to be read, one of which is the forward-looking predicate generation technology (LOOK-AHEAD predicate generation), a powerful tool that can greatly reduce query time.
Here is a simple connection case that explains why LPG can have such a positive impact on query performance.
Two connection methods
Imagine two tables: Small table A is relatively small, large table B is relatively large. A query request involves both tables, there is a local selection condition on small table A, there is no local selection on large table B, and the local selection of table A is high (selectivity) (for example, it determines the two rows on a small table a).
SELECT *
From Smalltablea A,
Largetableb B
WHERE A.join_col = B.join_col
and A.col1 in (112358, 132134)
If there is no index on big table B, you will have several ways to access it and a combination of connections:
One strategy is to use a large table B to connect small table A, a full table scan of large table B, and a table for each row of table B to be connected to a tables accessed through an index or hash table, as shown in Figure 1:
Figure 1: Large table B connection small Table A
Because there are no local selection conditions on large table B, each row selected from large table B is used to connect to the small table A, which produces a large number of read operations.
Another strategy is to use small table A to connect to large table B, which typically scans the entire table for a temporary index or a temporary hash table that is generated by large table B, as shown in Figure 2: