This article describes extreme SQL performance tuning techniques for DB2: the impact of cursors on DB2 performance, the tuning techniques themselves, and a look at the tuning techniques still to come.
DB2, SQL, tuning
Use the right performance tuning techniques for your workload to optimize DB2 performance and avoid hardware upgrades.
Performance is measured by response time, throughput, peak response time, and hits and sessions per second. SQL coding and tuning techniques directly affect all of these. Developing high-performance DB2 applications requires an in-depth understanding of DB2.
Of course, these techniques matter little when the amount of data is small. Ignoring joins, subqueries, table expressions, and CASE expressions can work well under light loads. A program that retrieves all of its data with SELECT INTO statements runs quickly at first.
However, once data volumes and session rates grow, performance suffers badly. Scaling DB2 takes well-tuned SQL together with a schema design, performance structures, buffer pools, and storage that are optimized for the workload pattern. The alternative is to upgrade the hardware. If you have an endless budget for hardware upgrades, you do not need to read this article. For everyone else, I will explain how to code smart SQL statements and optimize access paths.
Impact of cursors on DB2 performance
Some time ago, a new batch program was added to a large and complex banking application. The program and its access paths had been checked in a code review, but the project deadline left little time for testing. In its first production run, the program stopped after running for 10 hours.
A careful walkthrough of the code found seven cursors, each accessing data in a different table. Each cursor was opened inside the fetch loop of another open cursor, with data passed from one to the next. In other words, the program joined seven tables outside DB2. That is not smart SQL. The information needed was spread across seven tables, yet each cursor read only one of them. The seven cursors were therefore merged into a single smart cursor:
- SELECT COL1, COL2, rest of the columns
- FROM ADDR A, NAME N, T3, T4, T5, T6, T7
- WHERE A.COL1 = N.COL9
- AND N.COL9 = T3.COL3
- AND T3.COL3 = T4.COL4
- AND T4.COL4 <> T5.COL5
- AND T4.COLX <> T5.COLY
- AND T5.COL6 = T6.COL6
- AND T6.COL6 = T7.COL7
- AND T6.CODE = :hv
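For contrast, the nested-cursor pattern that this single statement replaces might look roughly like the sketch below. Only two of the seven cursors are shown. The cursor names and the host variable :addr_col1 are hypothetical, because the article does not give the original program's declarations.

```sql
-- Hypothetical sketch of the anti-pattern: the program, not DB2, performs the join.
-- Outer cursor: the program fetches every ADDR row.
DECLARE C_ADDR CURSOR FOR
  SELECT COL1
  FROM ADDR;

-- Inner cursor: opened, fetched, and closed once per row fetched from C_ADDR,
-- with the ADDR value passed in through a host variable.
DECLARE C_NAME CURSOR FOR
  SELECT COL9
  FROM NAME
  WHERE COL9 = :addr_col1;
```

Each row fetched at one level triggers another OPEN at the next level, and the optimizer never sees the join as a whole.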
With the single cursor, the batch job ran in about four minutes the next day. Most people would call that a success and stop, but a perfectionist would not. A careful look at the EXPLAIN output revealed an interesting join sequence problem. The optimizer had chosen nested loop joins for all seven tables and started the join sequence with the two large tables (ADDR and NAME), each containing 10 million rows. This is not typical behavior for the DB2 optimizer. However, some of the small tables are joined on columns compared with <>.
Such comparisons are hard for the optimizer to estimate because the statistics in the DB2 catalog describe equal column values, not unequal ones. The access path therefore has to be tuned by hand. An experienced DB2 tuner has a range of techniques in mind: some work at the package or statement level, and others work at the predicate level. There are also extreme DB2 techniques that do not work in the conventional way.
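One way to see the join sequence the optimizer picked is to EXPLAIN the statement and read the plan table. The sketch below assumes DB2 for z/OS with a PLAN_TABLE under the current SQLID. The query number 100 is arbitrary, the select list is shortened, and the host variable is replaced by a parameter marker.

```sql
-- Assumption: a PLAN_TABLE exists for the current SQLID (DB2 for z/OS).
EXPLAIN PLAN SET QUERYNO = 100 FOR
  SELECT A.COL1, N.COL9
  FROM ADDR A, NAME N, T3, T4, T5, T6, T7
  WHERE A.COL1 = N.COL9
    AND N.COL9 = T3.COL3
    AND T3.COL3 = T4.COL4
    AND T4.COL4 <> T5.COL5
    AND T4.COLX <> T5.COLY
    AND T5.COL6 = T6.COL6
    AND T6.COL6 = T7.COL7
    AND T6.CODE = ?;

-- One row per table access; PLANNO gives the join sequence within a query block.
-- METHOD: 0 = first table accessed, 1 = nested loop join,
--         2 = merge scan join, 4 = hybrid join.
SELECT QBLOCKNO, PLANNO, METHOD, TNAME, ACCESSTYPE, ACCESSNAME, MATCHCOLS
FROM PLAN_TABLE
WHERE QUERYNO = 100
ORDER BY QBLOCKNO, PLANNO;
```

Reading the rows in PLANNO order shows which tables lead the join and which join method connects each following table.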
One prerequisite for all of the following tuning techniques is that your catalog holds sufficient statistics. You can use the Statistics Advisor to make sure the optimizer has an accurate picture of your data.
DB2 performance tuning techniques
Package-level tuning: the REOPT (ONCE/ALWAYS/AUTO) BIND option. This option tells the optimizer to reoptimize each statement in the package at run time, either ONCE or ALWAYS (on every execution); in DB2 9 you can also use AUTO (when needed). The overhead depends on the option chosen and on the number and complexity of the SQL statements. It is negligible in a batch program, but it can have a large impact on short transactions. In our example, the batch cursor has only one predicate with a host variable, on a column with a cardinality of 1. REOPT is intended for non-uniform column value distributions and highly variable host variable contents, which is the opposite of COLCARDF = 1. Package-level tuning is therefore not appropriate here.
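Whether REOPT is likely to help can be judged from the column statistics. The query below is a minimal sketch, assuming DB2 for z/OS and the T6.CODE column from the example; the schema name MYSCHEMA is a placeholder.

```sql
-- COLCARDF is the estimated number of distinct values in the column
-- (-1 means statistics have not been collected).
SELECT TBNAME, NAME, COLCARDF
FROM SYSIBM.SYSCOLUMNS
WHERE TBCREATOR = 'MYSCHEMA'   -- placeholder schema name
  AND TBNAME = 'T6'
  AND NAME = 'CODE';
```

A value of 1 means every row holds the same value, so knowing the actual content of the host variable at run time gives the optimizer nothing new to work with.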
Statement-level tuning: the OPTIMIZE FOR n ROWS and FETCH FIRST n ROWS ONLY clauses. These clauses go at the end of a SELECT statement and tell the optimizer that the entire result set is not needed. Without them, the optimizer assumes that every SELECT statement needs its entire result, which biases it toward access paths such as sequential prefetch and list prefetch. Because our batch cursor does need the entire result, statement-level tuning is not appropriate here either.
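For reference, the two clauses look like this when only the first few rows are wanted. This is a sketch with an arbitrary n of 10, an assumed ORDER BY column, and a shortened select list. It does not apply to the batch cursor, which needs every row.

```sql
-- Ask the optimizer to favor an access path that returns the first 10 rows
-- quickly; the application may still fetch more.
SELECT N.COL9
FROM NAME N
ORDER BY N.COL9
OPTIMIZE FOR 10 ROWS;

-- Limit the result set itself to 10 rows.
SELECT N.COL9
FROM NAME N
ORDER BY N.COL9
FETCH FIRST 10 ROWS ONLY;
```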
Predicate-level tuning: adding a fake filter (TX.CX = TX.CX) or adding a no-op to a predicate (+ 0, - 0, / 1, * 1, CONCAT ''). A fake filter influences the optimizer by lowering the overall filter factor (the fraction of a table's rows that qualify). This can change the join sequence, index selection, and join method. Several fake filters may be added, but each must be on a column that is not already referenced.
A no-op influences the optimizer by demoting a predicate from indexable to non-indexable, but it is useful only on z/OS; the LUW optimizer is not affected. This change can also alter the join sequence, index selection, and join method. Predicate-level techniques can be combined to get the desired result. The cursor in our example did not respond to any combination of predicate-level changes, so it was time to bring out the heavy weapons.
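Applied to the example cursor, the two predicate-level techniques might look like the sketch below. T7.COL8 is a hypothetical column that is not referenced elsewhere, and the choice of which predicate receives the no-op is illustrative only. The sketch also assumes numeric columns. Whether either change helps has to be confirmed with EXPLAIN.

```sql
SELECT A.COL1, N.COL9
FROM ADDR A, NAME N, T3, T4, T5, T6, T7
WHERE A.COL1 = N.COL9
  AND N.COL9 = T3.COL3 + 0      -- no-op: adding 0 leaves the value unchanged
  AND T3.COL3 = T4.COL4
  AND T4.COL4 <> T5.COL5
  AND T4.COLX <> T5.COLY
  AND T5.COL6 = T6.COL6
  AND T6.COL6 = T7.COL7
  AND T6.CODE = :hv
  AND T7.COL8 = T7.COL8;        -- fake filter on a hypothetical, unreferenced column
```

A fake filter of the form C = C is not true for rows where C is NULL, so it belongs on a NOT NULL column if the result must stay unchanged. For character columns, the equivalent no-op is CONCAT ''.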
Extreme tuning techniques include table expressions that use DISTINCT and other ways of working around the limits of DB2 cross-query-block optimization. These techniques require rewriting the query by hand. They force the optimizer to process the query blocks in a specified order. Use them only as a last resort, because they can just as easily change the join sequence, index selection, and join method from good to bad. A DISTINCT table expression forces the optimizer to execute the query in parentheses before the other query blocks.
If the columns named in the SELECT DISTINCT come from more than one table, the table expression is materialized in a work file and sorted for uniqueness. Our batch cursor had a poor join sequence, so it was rewritten as the following query:
- SELECT all columns needed
- FROM ADDR, NAME,
- (SELECT DISTINCT columns from tables 3 through 7
- FROM T3, T4, T5, T6, T7
- WHERE join conditions T3 through T7
- AND T6.CODE = :hv) AS TEMP
- WHERE join conditions for ADDR, NAME and TEMP
This rewrite forces the optimizer to join tables T3 through T7 before it joins ADDR and NAME. If the DISTINCT keyword is omitted from the example above, the DB2 optimizer merges the table expression with the outer query, which yields the same statement and join sequence as the original.
SELECT DISTINCT is the key component. Because its column list spans several tables, the result of the five-table join is materialized in a work file and sorted for uniqueness. The sort handles only a few thousand rows on average per run, so its overhead is negligible. The batch program now completes within two minutes.
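Filled in with the join predicates from the original cursor, the rewrite might look like the sketch below. The select lists and the join between NAME and TEMP are illustrative, because the article does not list the actual columns.

```sql
SELECT A.COL1, N.COL9, TEMP.COL3, TEMP.COL7
FROM ADDR A, NAME N,
     (SELECT DISTINCT T3.COL3, T7.COL7      -- columns from more than one table
      FROM T3, T4, T5, T6, T7
      WHERE T3.COL3 = T4.COL4
        AND T4.COL4 <> T5.COL5
        AND T4.COLX <> T5.COLY
        AND T5.COL6 = T6.COL6
        AND T6.COL6 = T7.COL7
        AND T6.CODE = :hv) AS TEMP
WHERE A.COL1 = N.COL9
  AND N.COL9 = TEMP.COL3;
```

DISTINCT removes duplicate rows from the inner result, so the rewrite returns the same rows as the original only when those duplicates are not needed.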
Future tuning techniques
Other query rewrite techniques draw on information from all of a statement's query blocks to rewrite the query. IBM used to call this cross-query-block optimization; in DB2 9 it is called global optimization. The good news is that these techniques have begun to appear in the QWR stage of the DB2 optimizer, and they are just around the corner for all DB2 queries. In the meantime, we still need to keep a few extreme DB2 techniques in our own hands.