Take full advantage of the Informix Dynamic Server optimizer via update statistics

Source: Internet
Author: User
Tags filter expression informix types of filters advantage

Make full use of the database optimizer through the UPDATE STATISTICS statement in Ibm®informix®dynamic Server (IDS). Read the brief description of this SQL statement in this article to learn how to use it to solve a variety of different problems. Discover the importance of updating statistics, and learn how to collect statistical information. Finally, look at the last FAQ section of this article for answers to your questions about this important SQL statement.

Brief introduction

Update STATISTICS is a dedicated Informix SQL command that analyzes the data and stores that information in the system catalog, updating the actual information about each table and its columns, which can be used to estimate the cost of subsequent queries. To really understand the importance of the UPDATE STATISTICS, you need to understand what happens when the user enters a SQL query to execute. Each SQL query that you enter must be parsed, optimized, and executed.

The optimizer is the component that is used to prepare the query plan. Ideally, a query plan is the best plan for executing a given query-that is, it determines the best way to crawl data. To do so, it uses a collection of statistical data, but this statistic is not necessarily accurate. The accuracy of this data depends on a number of factors, such as the type of sampling algorithm used, the number of samples and the skew of the data.

The query optimizer does not automatically recalculate the table's configuration file. In some cases, it may take more time to gather statistics than to execute the query. To ensure that the query plan selected by the optimizer can best reflect the current state of the table, UPDATE STATISTICS should be run periodically.

The working principle of UPDATE STATISTICS

After the initial load of data and the creation of an index, you should run UPDATE STATISTICS. Also, you should run this command whenever you make significant changes to a database table, including large-scale inserts, updates, or deletions. If UPDATE STATISTICS is not running, the optimizer can only use inaccurate data to determine the access path.

Now, learn how the command works. The query optimizer estimates the cost of the query based on the number of rows retrieved from each table. The estimated number of rows depends on the selection rate of each conditional expression used in the WHERE clause. A filter is a conditional expression that is used to select a row. The selection rate is a value between 0 and 1 that represents the percentage of rows in the table that can pass through the filter. The selection rate of the filter, which has only a few rows, tends to be 0, and its selection rate tends to be 1 for filters that can pass almost all rows.

The optimizer can use the data distribution to calculate the selection rate of filters in a query. However, if there is no data distribution, the database server calculates the selection rate of different types of filters based on the table index.

The accuracy of the selection rate valuation has a significant impact on the cost of each execution plan. Therefore, the accuracy of getting the best plan depends entirely on the most recent statistics about the database objects involved in the query.

The following system catalog tables are refreshed whenever the UPDATE STATISTICS query is run. The columns in the table are listed under each heading.

In the Systables

Nindexes-Number of indexes in the table

nrows-Number of rows in the table

npused-Number of pages used to store Tupule ' nrows '

In the syscolumns

Colmin-Secondary decimal value of a column

Colmax-the secondary large value of the column

In sysindexes and sysfragments.

The series in the levels-b-tree

The number of leaves contained in the leaves-b-tree

Nunique-Number of different values

Clust-the degree of clustering associated with a table

In the Sysdistrib

The row that contains the distribution information for the value of the User data table. (so all columns are affected)

It also maintains the resolution and confidence of the distribution

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.