Today, we will focus on the correct solution for using distribution statistics in the DB2 optimizer. We all know that the DB2 database optimizer is a cost-based optimizer. It mainly makes decisions based on the statistical information of tables and indexes.
DB2 not only provides basic statistics, but also allows the creation of so-called distribution statistics. This article explains the situations where distribution statistics and distribution statistics are particularly important and what application developers should consider in order for the DB2 optimizer to create an effective access plan.
Introduction
To execute queries or DML statements INSERT, UPDATE, and DELETE), DB2 must create an access plan ). The access plan defines the order in which tables are accessed, the indexes used, and the join method used to associate data. A good access plan is crucial for quick execution of SQL statements. The DB2 optimizer can create an access plan. This is a cost-based optimizer, which means that it makes decisions based on the statistical information of tables and indexes.
DB2 not only provides basic statistics, but also allows the creation of so-called distribution statistics. Not only does the database administrator need to understand the distribution statistics, but also application developers need to understand the distribution statistics. Application developers must be careful because distribution statistics are important to the DB2 optimizer in some cases.
The use of primary variables or parameter tags in Java is java. SQL. PreparedStatement) may cause obstacles, so that the optimizer cannot take advantage of distribution statistics to the maximum extent. This article explains the situations where distribution statistics and distribution statistics are particularly important and what application developers should consider in order for the DB2 optimizer to create an effective access plan.
Basic statistics and distribution statistics
Before studying the distribution statistics, let's take a look at the basic statistics. You only need to execute RUNSTATS to collect statistics about these tables.
Table statistics:
Currently used page count
Number of pages containing record rows
Number of overflow rows
Base number of rows in the table)
For MDC tables, there are also blocks containing data.
Related Statistics of each column in the table:
Column Base
Average Column Length
The second largest value in the column.
The second small value in the column.
Number of NULL values in the column
Generally, when running RUNSTATS, you can collect not only statistics about the table, but also related statistics about the corresponding index. For more information about the statistics collected for indexing, see DB2 Administration Guide: Performance-Statistical information that is collected.