DB2 is a high-performance large-scale relational database management system, which is widely used in the customer/server architecture. The criteria for evaluating system performance optimization include throughput, response time, and parallel capability.
Design Database
1. Familiar with business systems
The degree of familiarity with the business system has a great impact on the performance of the entire database system. A designer who is unfamiliar with the business, despite having a wealth of database knowledge, it is also difficult to design a database application system with the best performance.
2. Standardization and non-Standardization
After the database is normalized, data redundancy is reduced, data volume becomes smaller, and data rows become narrower. In this way, each page of DB2 can contain more rows, so there is more data in each partition, which accelerates table scanning and improves the query performance of a single table. However, when querying multiple tables, you need to use a lot of Join Operations to combine the information from each table, resulting in higher CPU and I/O costs. In most cases, we need to maintain a balance between normalization and non-standardization, use appropriate redundant information to reduce system overhead, and use the space price in exchange for the time cost. There is an order information table OrderDetail, which records the courier information, cashier information, item information, price policy, customer information ..... The information is stored in the courier information table, cashier information table, item information table, price policy table, and customer information table. According to the standardization requirements, OrderDetail must be connected to so many tables or nested queries. If the data volume in the OrderDetail table is millions, the time required for a query may be several hours. In fact, as long as the logic validity of the data is ensured during the design, a lot of information can be directly redundant in the OrderDetail table. These redundant data can greatly improve the query efficiency, this reduces CPU and I/O operations.
3. Data striping
If the number of records in a table exceeds a certain size, the most basic query operation will also be affected. You need to divide the table according to the date level, divides recent and most frequently used data into historical and infrequently used data, or by geographic location or department. There is also a way of partitioning-vertical partitioning, that is, dividing a table with many attribute columns into several small tables, such as placing frequently used attributes in a table, attributes that are not frequently used are placed in another table, which can speed up table scanning and improve efficiency.
4. Select a data type
The data type selected for each attribute depends on the requirements of the table to a large extent. However, selecting an appropriate data type without violating the requirements of the table can improve the system performance. For example, a text column stores the information of a book, BLOB instead of character (1024), BLOB stores pointer or file reference variables, and real text information can be stored outside the database, this reduces the database storage space and increases the running speed. DB2 provides the UDT (User-Defined PES ypes) function. You can define your own data types as needed.
5. Select an index
An index is an important data structure in a database. Its fundamental goal is to improve query efficiency. Currently, most database products adopt the ISAM index structure first proposed by IBM. You can use indexes to quickly, directly, and orderly access data. Although indexing accelerates queries, it also reduces the data update speed, because new data must be added not only to tables, but also to indexes. In addition, indexing requires additional disk space and maintenance costs. Therefore, you must use the index properly:
◆ Indexes are created on Attribute columns that are frequently connected but not specified as foreign keys.
◆ Create an index on columns that are frequently sorted or grouped (that is, group by or order by operations. Sorting or grouping by index can improve the efficiency.
◆ Create searches for columns with different values that are frequently used in condition expressions. Do not create indexes for columns with fewer values.
◆ If there are multiple columns to be sorted, you can create a compound index on these columns, that is, the index is composed of multiple fields.
Query Optimization
Currently, database products have improved in terms of system query optimization, but since SQL statements submitted by users are the basis of system optimization, it is hard to imagine that a bad query plan will become efficient after the system is optimized. Therefore, the merits and demerits of the statements written by users are crucial. The following describes solutions to improve the user query plan.
1. Sorting
In many cases, duplicate sorting of large tables should be simplified or avoided. When indexes can be used to automatically generate outputs in an appropriate order, sorting steps can be avoided. When the following situations occur, sorting cannot be omitted:
◆ The index does not contain one or more columns to be sorted;
◆ The order of the columns in the group by or order by clause is different from that of the index;
◆ Sort columns from different tables.
In order to avoid unnecessary sorting, We need to correctly add indexes and reasonably merge database tables. Although Table normalization may be affected, it is worthwhile to Improve the efficiency. If sorting is unavoidable, you should try to simplify it, such as narrowing the range of sorting columns.
2. Primary Key
Using an integer type for a primary key greatly improves query efficiency, and the comparative overhead for a primary key type is much larger than that for an integer type. Using primary data as a primary key will reduce the efficiency of data insertion, update, and query. This reduction may not be noticed when the data volume is small. However, when the data volume is large, small improvements can also increase the system response speed.
3. nested Query
In SQL, a query block can be used as an operand OF THE predicates in another query block. Therefore, SQL queries can be nested. For example, in a large distributed database system, the Order table Order and Order information table OrderDetail are available. If two tables need to be associated for query:
SELECT CreateUser FROM Order WHERE OrderNo IN ( SELECT OrderNo FROM OrderDetail WHERE Price=0.5)
|