Experience in using optimizer updates to execute plans in DB2 databases (1)

Source: Internet
Author: User

The following article describes how to use optimizer updates to execute plans in DB2 databases, we all know that in a DB2 database, the optimizer is generally used to analyze SQL statements and generate an Access Plan that is considered optimal ).

The DB2 optimizer is actually a set of standard rules. Generally, we only need to tell DB2 what to search, not how to search.

So what is the DB2 optimizer used to determine the optimal SQL access path?

DB2 optimizer is a Cost-Based optimizer, namely, CBO (Cost Based Optmizer ). That is to say, the DB2 optimizer applies the query cost formula, which evaluates and balances the four factors of each possible access path: CPU cost, I/O cost, statistical information in the DB2 system directory, and actual SQL statements.

Let's take a brief look at the workflow of the DB2 optimizer:

1. After receiving the SQL statement, the DB2 optimizer first checks the SQL syntax to ensure that the SQL statement is correct.

2. Generate the optimal execution plan based on the current system environment information to optimize the SQL statement.

3. Translate the SQL statement into a computer instruction language and execute the optimized SQL statement.

4. Return results or store them for future execution

In our opinion, the statistical information in the DB2 system directory is a very important basis for the DB2 optimizer to work correctly. These statistics provide the optimizer with information about the table status to be accessed by the optimized SQL statement. The information mainly includes:

Table: includes the number of records, PAGE, PCTFREE, and COMPRESS of the Table. The related system views are: sysstat. tables and syscat. tables.

Columns-including the number, length, distribution characteristics, and COMPRESS of COLUMNS. The system views are: sysstat. columns, syscat. columns

Index-including the existence of indexes, the number of leaf pages and the number of levels of the indexed organization), the number of discrete values of the Index key, and whether the Index is clustered. The system view is sysstat. indexes, syscat. indexes

Others include partition/node group information and table space information.

How can we update this information in a timely manner? To ensure that the DB2 optimizer works properly, the following methods are provided in DB2.

1. RUNSTATS and REOGCHK

The Runstats command is mainly used to collect the state information of DB2 database objects, which is crucial for the database to use a reasonable access plan. In general, we need to use runstats to collect statistics in the following situations:

1. After creating an index for the table, we 'd better perform runstat once. This situation is often ignored. Most of the time, after adding an index to the table, you can analyze the execution plan and find that there is no change, which is strange. In fact, you need to perform a runstats operation at this time. In MySQL 8.2, DB2 has made great improvements to avoid this problem. You can update your information immediately when creating an index.

2. After performing a reorg operation on the table, remember to perform a runstats operation. Because reorg is performed on a table, a lot of information about the table is modified, such as the high level. Therefore, a runstats operation can update the statistical information.

3. When the data in your table changes a lot, in general, about 10%-20% of the data in the table has changed, you should perform a runstats operation. These changes include deletion, modification, and insertion. For some very large tables, for example, in the data warehouse project, some fact tables are very huge.

At this time, it may take a considerable amount of time to complete a large table runstats. in DB2 8.1, We can sample these large tables, for example, only 20% of the data is runstats, in general, it is also possible to get the correct execution plan. Make sure that the data in this table is evenly distributed.

4. When you use the redistribute database partition group command in the PARTITION (DPF) DATABASE, you need to use runstats to collect new statistics.

The RUNSTATS command syntax is as follows:

If the table name is DB2INST1. STAFF and the table has an index, you can use the following example to complete the RUNSTATS command:

 
 
  1. db2 runstats on table db2inst1.staff with distribution and detailed indexes all 

In actual projects, for tables with relatively large changes, we need to regularly perform runstats for the DB2 database. In general, runstats and reorg can be combined. First, reorg is the table, and then runstats, finally, the REBIND database generates an appropriate statistical plan based on the latest statistics.

It is worth noting that if the table data volume to be processed changes rapidly, for example, in the China Telecom Mobile Industry, the summary table needs to be processed at the end of the month. The amount of data changes greatly within a short period of time, which makes the statistics obtained by RUNSTATS inaccurate because these statistics are only information at a certain time point. You can use this statement to change the table to volatile.

 
 
  1. alter table table_name volatile cardinality 

In this way, the optimizer will consider using index scanning instead of table scanning. Regardless of the statistics, the optimizer uses index scanning instead of table scanning.

The IBM document also provides the REORGCHK command to calculate whether the table needs to be reorganized based on the statistical formula.


Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.