What are the solutions to optimize the performance of the DB2 application system?

Source: Internet
Author: User

This article mainly describes how to optimize the performance of DB2 application systems. We all know that DB2 database is a high-performance large-scale relational DB2 management system, it is widely used in the customer/server architecture. The criteria for evaluating system performance optimization include:

Throughput, response time, and parallel capability. This article discusses the optimization of system performance from the perspectives of database design, query optimization, concurrency control, and customer/Server mode.

Design Database

1. Familiar with business systems

Familiarity with the business system has a great impact on the performance of the entire DB2 application system. A designer who is unfamiliar with the business, despite having a wealth of database knowledge, it is also difficult to design a database application system with the best performance.

2. Normalization and non-Standardization

After the database is normalized, data redundancy is reduced, data volume becomes smaller, and data rows become narrower. In this way, each page of DB2 can contain more rows, so there is more data in each partition, which accelerates table scanning and improves the query performance of a single table. However, when querying multiple tables, you need to use a lot of Join Operations to combine the information from each table, resulting in higher CPU and I/O costs.

In most cases, we need to maintain a balance between normalization and non-standardization, use appropriate redundant information to reduce system overhead, and use the space price in exchange for the time cost. There is an order information table OrderDetail, which records the courier information, cashier information, item information, price policy, customer information ..... The information is stored in the courier information table, cashier information table, item information table, price policy table, and customer information table.

According to the standardization requirements, OrderDetail must be connected to so many tables or nested queries. If the data volume in the OrderDetail table is millions, the time required for a query may be several hours. In fact, as long as the logic validity of the data is ensured during the design, a lot of information can be directly redundant in the OrderDetail table. These redundant data can greatly improve the query efficiency, this reduces CPU and I/O operations.

3. Data striping

If the number of records in a table exceeds a certain size, the most basic query operation will also be affected. You need to divide the table according to the date level, divides recent and most frequently used data into historical and infrequently used data, or by geographic location or department. There is also a way of partitioning-vertical partitioning, that is, dividing a table with many attribute columns into several small tables, such as placing frequently used attributes in a table, attributes that are not frequently used are placed in another table, which can speed up table scanning and improve efficiency.

4. Select a data type

The data type selected for each attribute depends on the requirements of the table to a large extent. However, selecting an appropriate data type without violating the requirements of the table can improve the system performance. For example, a text column stores the information of a book, BLOB instead of character (1024), BLOB stores pointer or file reference variables, and real text information can be stored outside the database, this reduces the database storage space and increases the running speed. DB2 provides the UDTUser Defined PES ypes function. You can define your own data types as needed.

5. Select an index

An index is an important data structure in a database. Its fundamental goal is to improve query efficiency. Currently, most database products adopt the ISAM index structure first proposed by IBM. You can use indexes to quickly, directly, and orderly access data. Although indexing accelerates queries, it also reduces the data update speed, because new data must be added not only to tables, but also to indexes. In addition, indexing requires additional disk space and maintenance costs. Therefore, you must use the index properly:

Indexes are created on Attribute columns that are frequently connected but not specified as foreign keys.

Create an index on a column that is frequently sorted or grouped by or order. Sorting or grouping by index can improve the efficiency.

Create a search for columns with different values that are frequently used in conditional expressions. Do not create an index for columns with fewer values.

If there are multiple columns to be sorted, you can create a compound index on these columns), that is, the index is composed of multiple fields.

Query Optimization

Currently, database products have improved the query optimization of DB2 application systems. However, because SQL statements submitted by users are the basis of system optimization, it is hard to imagine that a bad query plan will become efficient after the system is optimized. Therefore, the merits and demerits of the statements written by users are crucial. The following describes solutions to improve the user query plan.

1. Sort LaG \ u0014ERCx \ u001F

In many cases, duplicate sorting of large tables should be simplified or avoided. When indexes can be used to automatically generate outputs in an appropriate order, sorting steps can be avoided. When the following situations occur, sorting cannot be omitted:

The index does not contain one or more columns to be sorted;

The order of columns in the group by or order by clause is different from that of the index;

Sort columns from different tables.

In order to avoid unnecessary sorting, We need to correctly add indexes and reasonably merge database tables. Although Table normalization may be affected, it is worthwhile to Improve the efficiency. If sorting is unavoidable, you should try to simplify it, such as narrowing the range of sorting columns.

2. Primary Key

Using an integer type for a primary key greatly improves query efficiency, and the comparative overhead for a primary key type is much larger than that for an integer type. Using primary data as a primary key will reduce the efficiency of data insertion, update, and query. This reduction may not be noticed when the data volume is small, but when the data volume is large, small improvements can also improve the response speed of the DB2 application system.

3. nested Query

In SQL, a query block can be used as an operand OF THE predicates in another query block. Therefore, SQL queries can be nested. For example, in a large distributed database system, the Order table Order and Order information table OrderDetail are available. If two tables need to be associated for query:

 
 
  1. SELECT  
  2. FROM Order  
  3. WHERE OrderNo  
  4. ( SELECT OrderNo  
  5. FROM OrderDetail  
  6. WHERE Price=0.5)  

In this query, find out the list of subscriber whose unit price is 0.5 yuan. The lower-level query returns a group of values to the upper-level query, and then the upper-level query blocks continue to query based on the values provided by the lower-level blocks. In this nested query, each OrderNo value in the upper-level query and all lower-level queries must scan the OrderDetail table, and the execution efficiency is obviously not high.

This query contains two layers of nesting. IF 1000 rows are queried at each layer, 1 million rows of data are queried. In the DB2 Application System overhead, 82% of table Order scans and 16% of table OrderDetail searches. If we use a connection instead, that is:

 
 
  1. SELECT CreateUser  
  2. FROM Order,OrderDetail  
  3. WHERE Order.OrderNo=OrderDetail.OrderNo AND Praice=0.5  

Therefore, the table Order scan accounts for 74%, and the table OrderDetail search accounts for 14%.

In addition, the label of a column appears in both the primary query and the query in the where clause. It is very likely that after the column value in the primary query changes, the subquery must perform a new query. The more nested query layers, the lower the efficiency. Therefore, avoid subqueries as much as possible. If the subquery is unavoidable, filter as many rows as possible in the subquery.

4. wildcard characters

In SQL statements, the LIKE keyword supports wildcard matching, but this matching is especially time-consuming. For example, SELECT * FROM Order WHERE CreateUser LIKE'm ___'. Even if an index is created on the CreateUser field, the sequence scan is used in this case. If the Order table contains 1000 records, it needs to be compared for 1000 times. If you change the statement to SELECT * FROM Order WHERE CreateUser> 'M' AND CreateUser <'n', the query will be executed using indexes, which will obviously increase the speed.

5. distinct

Distinct is used to ensure that duplicate values are not displayed in the result set, but distinct generates a worksheet and sorts it to delete duplicate records, which greatly increases the number of queries and I/O operations. Therefore, do not use the distinct keyword.

6. Negative Logic

The negative logic is like! =, <>, Not in, and so on will cause DB2 to use table scanning to complete the query. When the table is large, the system performance is seriously affected and can be replaced by other operations.

7. Temporary table

When a temporary table is used, the database will create a corresponding data structure in the disk, because the memory access speed is much higher than the access speed of external memory, when a temporary table is used in complex queries, the intermediate results will be imported into the temporary table, which greatly reduces the query efficiency. In addition, in a distributed system, the use of temporary tables also results in synchronization between multiple query processes. Therefore, it is best not to use temporary tables for complex queries.

8. Stored Procedure

The Stored Procedure Builder in DB2 can generate a Stored Procedure, run and test the Stored Procedure. Stored Procedures can contain massive and complex queries or SQL operations, which are compiled and stored in the DB2 database. When you use the same SQL operation for multiple times, you can first make these SQL operations into a stored procedure and directly reference its name for calling where necessary.

During the first execution of the stored procedure, an optimized query scheme is established. DB2 stores the query scheme in the cache, and can directly execute the scheme from the cache when calling the runtime, the optimization and compilation stages are eliminated, saving execution time, improving efficiency and System utilization.

The optimal query scheme is usually not feasible according to certain criteria, and should be selected through comparison based on actual requirements and specific situations. The Query Patroller provided by DB2 can compare the Query costs of different Query schemes. By tracing Query statements, it returns the DB2 Application System overhead of different stages of Query to make the best choice. The Performance Monitor provided by DB2 also monitors the Performance of the entire database system, including I/O time, queries, sorting time, and synchronization read/write time.

The concurrency control of the database system can also affect the system performance. Simultaneous operations by multiple users may result in data inconsistency. To prevent data loss and access to uncommitted data due to simultaneous modifications, and to protect read data, DB2, use the Lock mechanism to implement control.

DB2 can lock table spaces, tables, columns, and indexes. The larger the lock granularity, the simpler the lock, the smaller the overhead, and the lower the concurrency; the smaller the granularity, the complicated lock mechanism, the large overhead, and the high concurrency. If the resources to be allocated are locked during concurrent processing, the DB2 Application System suspends the process.

If a time-consuming query operator runs on a frequently used table, using the table level-1 lock means that the entire system can continue to run after your query is complete. Therefore, avoid using table-level locks in complex queries. What should I do if there is a need for this category? View can be used to solve this problem. The view avoids direct operations on the table and ensures the efficient operation of the database.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.