Considerations for the initial optimization and Design of DB2 Databases

Last Update:2013-12-16 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Guidance:For the performance of an applicationDatabasePerformance is an important factor. Since the application and its related data will always change over time, it is necessary to constantly optimize the database to maintain the best level. However, the optimization efforts should be within a reasonable range. There should be a degree of optimization. All efforts beyond this degree can only have a negative impact. If the performance of an application is not satisfactory, consider other alternatives, such as moving the application to a faster platform. The following is an explanation.DB2 database initial optimizationAndDesignFor example, we will explain how to optimize the database.

The commands and syntax mentioned in this article are based on DB2 UDB V7. If you are using DB2 UDB V8, it may be slightly different.

Database Design Considerations

Database optimization begins with the design phase. Assuming that the hardware selection is based on other considerations, the first thing to decide is the storage architecture. The more drives DB2 uses, the faster it will, the better the potential performance. You should carefully plan the locations of tablespaces and other objects (logs, backup files, and so on. It is particularly important to ensure that the log and backup are on different drives as much as possible. This is not only for performance, but also for ease of recovery.

Table space design is an important part of the entire database design. You can create more than one user tablespace to improve performance. In the following three cases, it is useful to use multiple tablespaces:

Control I/O, if these tablespaces can be located on different drives.
Use pagesize ).
Control the buffer pool.
In most cases, isolated tablespaces are created for indexes and large objects. Creating more than one tablespace with the same page size does not have any benefit.

Compared with system-managed tablespaces, Database-managed tablespaces (especially on raw devices) provide better performance. When deciding on the page size, remember that DB2 can only put a maximum of 255 rows on one page, and the remaining space will not be used. For example, if the average length of a row is 50 bytes, the maximum space used for a page is 50*255 = 12750 bytes. If you place the table in a tablespace with a page size of 16 K or 32 K, some pages will be wasted. Otherwise, if some tables have longer rows or many columns (for specific restrictions, see the create table statement in the SQL Reference Manual), the page size must be greater than 4 K. If you want to access data in a continuous way (such as a cluster Table), you can use a larger page size to achieve better performance. On the contrary, if the data is accessed in a random manner, it is best to use the page size as small as possible.

Each tablespace is associated with a buffer pool with the same page size (one buffer pool can be associated with more than one tablespace ). Exercise caution when using multiple buffer pools. Because the available storage is limited, allocating too much space for a buffer pool will inevitably reduce the width of other buffer pools, resulting in a reduction in overall performance. Buffer pool optimization is best performed on the basis of database performance and benchmark detection. DB2 is good at dynamically managing available space. Therefore, using a minimum number of buffer pools in most cases can provide better performance.

For a long time, the importance of table design lies in standardization. Non-redundant data occupies the least space and has the best integrity. However, non-redundant data does not provide the best performance. To eliminate a little bit of redundancy, you need to create additional tables, which makes it necessary to combine the additional tables during the query to increase the complexity of the query. Correct judgment is required to balance these two needs. Generally, you can increase performance by generating redundant data, but this requires a constrained approach, that is, the redundant data must take the form of an index and a summary table. If you need to frequently access the summary data, the latter can significantly increase the performance. The refresh frequency should be evaluated based on the freshness of the information to be kept.

Index is one of the most important aspects of performance optimization. Generally, table access is based on some standards. Indexing Based on some columns that constitute these standards can dynamically reduce query-related overhead. A small number of indexes (one or two) should be created for unstable tables maintained online. For large historical tables, you need to query them in multiple ways, you need to create many indexes. The number of columns in an index should be as few as possible, unless many queries can be completed through an "index only" search. For this purpose, the INCLUDE option allows other fields to be appended to the index, and its overhead is smaller than the full index method. You can select an index of a table as a cluster index or specify the index in the REORGANIZE command. Table data remains in the order specified by the index. This method is useful when a large number of queries access a large number of rows based on this index. Indexes are usually placed in their own tablespace and have their own buffer pool to prevent index pages from being squeezed out when the number of data pages is large.

Application Design Considerations

Application Design also affects databases. The first step is to ensure that the application only requires the Database Manager to do the necessary work. For example, using SELECT * to request all columns can speed up the program to a certain extent, but this reduces the performance because additional data movement is required, the "index only" scan is blocked. The query contains unnecessary clauses, such as order by or DISTINCT, which is an example of requesting the Database Manager to do additional work. If the order of columns does not affect the running of the application, you can save the time spent in sorting.

The locking feature is very important for increasing the throughput of the database. Even for read-only transactions, commit is of great importance because locks are also required for such transactions. It is important to select a proper isolation level. Use the lowest possible isolation level, as long as the application can run at this level. For locks, the use of the Repeatable read isolation level is extremely expensive and reduces concurrency. As long as you do not intend to update the result set, you should include the for read only clause. This ensures that the exclusive lock is not obtained. The for update clause removes the need to obtain a higher lock. In some environments, obtaining a table lock through the application before a query can prevent many row locks from being obtained, thus preventing the gradual upgrade of the lock.

Query Optimization is another way to save a lot of resources. The optimization level can be set through the Database Configuration Parameter dft_queryopt. In addition, you can reset static SQL statements by using PREP and BIND commands. In dynamic SQL statements, You can reset them using the SET CURRENT QUERY OPTIMIZATION statement. For complex queries, 5th or higher levels may be required. You can use the db2batch tool to evaluate the time spent on compiling and executing SQL statements. As for the results, remember that static SQL statements are usually compiled once and executed multiple times. dynamic SQL statements are also the same, because the results must be cached.

Initial Optimization

After creating a database and a tablespace, you can use Performance Wizard to set the initial database configuration. Select the database and the "Configure Performance Using Wizard" option. This will allow faster data loading. After creating a data object, you should load the data.

The first step of optimization is to use the RUNSTATS command to collect statistics. To obtain the complete set of statistics, you should specify the "with distribution and indexes all" option. RUNSTATS should be a common part of database maintenance. RUNSTATS should be called regularly based on the database update rate (daily, weekly, and monthly. If you make a big change to the data (load or delete a large number of rows), you should also run the RUNSTATS command. Statistics can be used to determine which access plan is most effective for a query. After running the RUNSTATS command, the affected packages should be rebound.

After that, execute Performance Wizard again to specify the database to be filled in. Performance Wizard changes some database configuration parameters. If these problems are solved correctly, the value generated by Performance Wizard is generally ideal. Performance Wizard allows you to restore previous configurations (if any) on the first screen ). Read the comments in the window carefully. The last screen displays the old and new database configurations and highlights all changes in bold.

If the performance is unsatisfactory during testing, you should use Database System Monitor (see System Monitor Guide and Reference for details) or reduce the problem source to several transactions, to find out the cause of the problem. For specific queries, interpretation tools (see the Administration Guide for details) provide valuable information about possible causes of performance issues. Based on this information, you can change the index structure or database parameters.

Conclusion

Previous discussions highlighted some of the main considerations for tuning. Tuning is an iterative process. Over time, data and applications in the Database need to be changed. In this case, performance should be checked and the database should be changed to meet new needs. There must be a benchmark for database configuration parameters or other changes. In some cases, some changes seem correct, but they actually have a negative impact on the database. We hope that you will be able to have a good grasp of the DB2 database initial optimization and design knowledge through the above learning.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

Considerations for the initial optimization and Design of DB2 Databases

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support

Considerations for the initial optimization and Design of DB2 Databases

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support