Take full advantage of the Informix Dynamic Server optimizer via update statistics

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Make full use of the database optimizer through the UPDATE STATISTICS statement in Ibm®informix®dynamic Server (IDS). Read the brief description of this SQL statement in this article to learn how to use it to solve a variety of different problems. Discover the importance of updating statistics, and learn how to collect statistical information. Finally, look at the last FAQ section of this article for answers to your questions about this important SQL statement.

Brief introduction

Update STATISTICS is a dedicated Informix SQL command that analyzes the data and stores that information in the system catalog, updating the actual information about each table and its columns, which can be used to estimate the cost of subsequent queries. To really understand the importance of the UPDATE STATISTICS, you need to understand what happens when the user enters a SQL query to execute. Each SQL query that you enter must be parsed, optimized, and executed.

The optimizer is the component that is used to prepare the query plan. Ideally, a query plan is the best plan for executing a given query-that is, it determines the best way to crawl data. To do so, it uses a collection of statistical data, but this statistic is not necessarily accurate. The accuracy of this data depends on a number of factors, such as the type of sampling algorithm used, the number of samples and the skew of the data.

The query optimizer does not automatically recalculate the table's configuration file. In some cases, it may take more time to gather statistics than to execute the query. To ensure that the query plan selected by the optimizer can best reflect the current state of the table, UPDATE STATISTICS should be run periodically.

The working principle of UPDATE STATISTICS

After the initial load of data and the creation of an index, you should run UPDATE STATISTICS. Also, you should run this command whenever you make significant changes to a database table, including large-scale inserts, updates, or deletions. If UPDATE STATISTICS is not running, the optimizer can only use inaccurate data to determine the access path.

Now, learn how the command works. The query optimizer estimates the cost of the query based on the number of rows retrieved from each table. The estimated number of rows depends on the selection rate of each conditional expression used in the WHERE clause. A filter is a conditional expression that is used to select a row. The selection rate is a value between 0 and 1 that represents the percentage of rows in the table that can pass through the filter. The selection rate of the filter, which has only a few rows, tends to be 0, and its selection rate tends to be 1 for filters that can pass almost all rows.

The optimizer can use the data distribution to calculate the selection rate of filters in a query. However, if there is no data distribution, the database server calculates the selection rate of different types of filters based on the table index.

The accuracy of the selection rate valuation has a significant impact on the cost of each execution plan. Therefore, the accuracy of getting the best plan depends entirely on the most recent statistics about the database objects involved in the query.

The following system catalog tables are refreshed whenever the UPDATE STATISTICS query is run. The columns in the table are listed under each heading.

In the Systables

Nindexes-Number of indexes in the table

nrows-Number of rows in the table

npused-Number of pages used to store Tupule ' nrows '

In the syscolumns

Colmin-Secondary decimal value of a column

Colmax-the secondary large value of the column

In sysindexes and sysfragments.

The series in the levels-b-tree

The number of leaves contained in the leaves-b-tree

Nunique-Number of different values

Clust-the degree of clustering associated with a table

In the Sysdistrib

The row that contains the distribution information for the value of the User data table. (so all columns are affected)

It also maintains the resolution and confidence of the distribution

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

Take full advantage of the Informix Dynamic Server optimizer via update statistics

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support

Take full advantage of the Informix Dynamic Server optimizer via update statistics

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support