BKJIA summary: In the previous article we looked at performance optimization of the DB2 utilities. This time we focus on performance considerations for concurrent DB2 connections.
In general, when the number of connections is small, DB2 performance is relatively stable: the connected applications generate fewer requests than there are coordinating agents in the DB2 agent pool, so every request is picked up and processed promptly by a coordinating agent. When the connection concentrator is activated (MAX_CONNECTIONS > MAX_COORDAGENTS) and the number of connections exceeds the number of coordinating agents, connection requests queue up waiting for coordinating-agent service; the number of concurrent connections can grow, but the performance of some connections drops noticeably. In that case, consider enabling intra-partition parallelism (SMP) or the multi-partition (MPP) feature to increase I/O concurrency and exploit parallel computation across multiple CPUs.
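As a sketch, the connection concentrator is enabled from the DB2 command line by setting MAX_CONNECTIONS above MAX_COORDAGENTS in the database manager configuration; the values below are illustrative only, not sizing recommendations:

```sql
-- enable the connection concentrator: allow more connections than coordinating agents
-- (values are illustrative; size them for your own workload)
UPDATE DBM CFG USING MAX_COORDAGENTS 200;
UPDATE DBM CFG USING MAX_CONNECTIONS 1000;
-- an instance restart (db2stop / db2start) is required for these to take effect
```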
Case Analysis
Query Optimization Cases
Next, let's look at the detailed optimization steps and measurements for DML operations. First, a query optimization example. The following are the table creation statements used in the experiment:
- CREATE TABLE TESTOPT.T1_DMS (
- C11 VARCHAR (10) NOT NULL ,
- C12 VARCHAR (15) NOT NULL ,
- C13 VARCHAR (20) NOT NULL ,
- CONSTRAINT C11_PK PRIMARY KEY ( C11) ) IN DMS_Space;
- CREATE TABLE TESTOPT.T2_DMS (
- C21 VARCHAR (15) NOT NULL ,
- C22 VARCHAR (25) NOT NULL ,
- C23 VARCHAR (30) NOT NULL ,
- CONSTRAINT C21_PK PRIMARY KEY ( C21) ) IN DMS_Space;
- CREATE TABLE TESTOPT.T3_DMS (
- C31 VARCHAR (10) NOT NULL ,
- C32 VARCHAR (25) NOT NULL ,
- C33 VARCHAR (35) NOT NULL ,
- CONSTRAINT C31_PK PRIMARY KEY ( C31) ) IN DMS_Space;
In the initial, unoptimized environment, the tablespace was an SMS tablespace, the queried tables had no indexes, and SORTHEAP was too small. In this state, run the following query:
- select C12 from TESTOPT.T1_SMS, TESTOPT.T2_SMS, TESTOPT.T3_SMS
- where substr(C12,1,10)=substr(C21,1,10) and C22=C32
- order by C12 asc
Without optimization, the total execution time was 653 seconds; after optimization, it was about 15 seconds. The following optimization steps were used:
1. Switch to a DMS tablespace.
2. Add indexes:
- CREATE UNIQUE INDEX INDEX_C12 on T1_DMS (C12 ASC);
- CREATE UNIQUE INDEX INDEX_C22 on T2_DMS (C22 ASC);
- CREATE UNIQUE INDEX INDEX_C32 on T3_DMS (C32 ASC);
3. Increase the SORTHEAP size.
4. Run RUNSTATS.
5. Select an appropriate optimization level.
6. Improve the table structure by adding redundant columns, trading space for time:
- ALTER TABLE T1 ADD C12_Red VARCHAR(10);
- ALTER TABLE T2 ADD C21_Red VARCHAR(10);
- UPDATE T1 SET C12_Red=SUBSTR(C12,1,10);
- UPDATE T2 SET C21_Red=SUBSTR(C21,1,10);
The query statement then becomes:
- select C12 from TESTOPT.T1_DMS, TESTOPT.T2_DMS, TESTOPT.T3_DMS
- where C12_Red=C21_Red and C22=C32 order by C12 asc
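The sortheap, runstats, and optimization-level steps above can be sketched as CLP commands; the SORTHEAP value and the optimization class chosen here are illustrative, not tuned recommendations:

```sql
-- increase the sort heap (in pages); the value is illustrative
UPDATE DB CFG FOR FFTEST USING SORTHEAP 2048;
-- refresh optimizer statistics on the queried tables
RUNSTATS ON TABLE TESTOPT.T1_DMS WITH DISTRIBUTION AND DETAILED INDEXES ALL;
RUNSTATS ON TABLE TESTOPT.T2_DMS WITH DISTRIBUTION AND DETAILED INDEXES ALL;
RUNSTATS ON TABLE TESTOPT.T3_DMS WITH DISTRIBUTION AND DETAILED INDEXES ALL;
-- choose an optimization class for the current session
SET CURRENT QUERY OPTIMIZATION 5;
```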
Figure 1. Query operation optimization
As can be seen, choosing a DMS (Database Managed Space) tablespace and adding indexes greatly improve performance, and adding redundant columns has the largest effect of all. Of course, that changes the table structure, so it must be weighed during the database design stage; the cost is extra disk space.
Write operation optimization
Next is a write-operation (INSERT) example. The following is the test script:
- CONNECT TO FFTEST;
- CREATE SCHEMA TESTOPT;
- DROP TABLE TESTOPT.T3;
- CREATE TABLE TESTOPT.T3 (
- C31 VARCHAR (10) NOT NULL ,
- C32 VARCHAR (15) NOT NULL ,
- CONSTRAINT C31_A CHECK ( C31 LIKE 'A%' or C31 LIKE 'a%'));
- CREATE INDEX TESTOPT.INDEX_C31 on TESTOPT.T3 (C31 ASC);
- ALTER TABLE TESTOPT.T3 ADD CONSTRAINT C31_A2 CHECK (substr(C31,1,1)='a'
- or substr(C31,1,1)='A');
- ALTER TABLE TESTOPT.T3 APPEND OFF;
- CONNECT RESET;
Because the initial table was not optimized (it carried the indexes and constraints above), inserting 40,000 records took about 68 seconds. After optimization, inserting the same 40,000 records took only 6 seconds. The optimization steps are as follows:
Figure 2. Insert Operation Optimization
As can be seen, removing unnecessary indexes and constraints greatly improves INSERT performance, and merging multiple single-row INSERT statements into one multi-row INSERT is even more effective.
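As a minimal sketch against the TESTOPT.T3 table above (the inserted values are made up), a multi-row INSERT replaces a series of single-row statements:

```sql
-- many single-row statements: one trip through the engine per row
INSERT INTO TESTOPT.T3 VALUES ('A000000001', 'value-1');
INSERT INTO TESTOPT.T3 VALUES ('A000000002', 'value-2');
-- one multi-row statement: several rows in a single INSERT
INSERT INTO TESTOPT.T3 VALUES
  ('A000000003', 'value-3'),
  ('A000000004', 'value-4'),
  ('A000000005', 'value-5');
```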
Performance Tuning Considerations
In pursuit of high performance, a buffer pool can be made so large that the database can no longer be connected to. For inexperienced users this may look like a disaster, as if the database had to be rebuilt; we made exactly this mistake at first. In fact, you can set the DB2 registry variable DB2_OVERRIDE_BPF to override the buffer pool size and connect to the database again. Better still, activate STMM (the Self-Tuning Memory Manager) so that memory is adjusted automatically.
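A sketch of this recovery path (the buffer-pool size of 10000 pages is illustrative); DB2_OVERRIDE_BPF is set at the operating-system prompt with db2set, the rest are CLP commands:

```sql
-- at the OS prompt: force every buffer pool to a safe size (in pages) at next start
--   db2set DB2_OVERRIDE_BPF=10000
-- then restart the instance (db2stop / db2start) and reconnect
CONNECT TO FFTEST;
-- let STMM manage memory automatically from now on
UPDATE DB CFG FOR FFTEST USING SELF_TUNING_MEM ON;
ALTER BUFFERPOOL IBMDEFAULTBP SIZE AUTOMATIC;
```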
The roles of RUNSTATS and REORG are often overlooked. We have found more than one performance problem caused by the optimizer choosing a poor access plan based on stale statistics, dragging down overall system performance. The external symptom is not only slow SQL execution but sometimes an apparent I/O bottleneck or long system response times, which tempts us to look for the cause elsewhere; yet the root cause is often the optimizer working from bad statistics. Such problems frequently disappear after RUNSTATS and REORG are re-executed, so pay special attention to these two commands.
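A sketch of that maintenance sequence (the table name reuses the example schema from earlier in this article):

```sql
-- check which tables need reorganization and refresh basic statistics
REORGCHK UPDATE STATISTICS ON TABLE ALL;
-- reorganize a fragmented table, then recollect detailed statistics
REORG TABLE TESTOPT.T1_DMS;
RUNSTATS ON TABLE TESTOPT.T1_DMS WITH DISTRIBUTION AND DETAILED INDEXES ALL;
-- static SQL packages must be rebound to pick up the new access plans,
-- e.g. with the db2rbind utility at the OS prompt: db2rbind FFTEST -l rbind.log all
```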
Index and constraint factors are often ignored during data loading, degrading load performance. We once saw a case where importing 1,000 records into a table took five minutes. After checking many configuration settings without success, we found that the table had one primary key and four foreign keys; after dropping them, re-importing took only a few seconds. Therefore, when performing LOAD or large INSERT operations, drop the primary key, foreign keys, and related indexes where possible and rebuild them after loading, letting the loading program guarantee data integrity instead. This is often forgotten, so check the index status and referential relationships of all tables before loading data.
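The drop-load-rebuild pattern might look like this; all table, column, and constraint names here are hypothetical, invented purely for illustration:

```sql
-- hypothetical table ORDERS with one primary key and a foreign key to CUSTOMER
ALTER TABLE TESTOPT.ORDERS DROP FOREIGN KEY FK_CUST;
ALTER TABLE TESTOPT.ORDERS DROP PRIMARY KEY;
-- load the data, leaving integrity to the program that prepared the file
LOAD FROM orders.del OF DEL INSERT INTO TESTOPT.ORDERS;
-- rebuild the constraints afterwards
ALTER TABLE TESTOPT.ORDERS ADD CONSTRAINT PK_ORDERS PRIMARY KEY (ORDER_ID);
ALTER TABLE TESTOPT.ORDERS ADD CONSTRAINT FK_CUST FOREIGN KEY (CUST_ID)
  REFERENCES TESTOPT.CUSTOMER (CUST_ID);
-- clear the check-pending state that LOAD leaves on tables with constraints
SET INTEGRITY FOR TESTOPT.ORDERS IMMEDIATE CHECKED;
```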
When modifying DB2 parameters, it is best to change one parameter at a time, observe the effect, and only then adjust others. Otherwise, if several parameters are changed at once and performance shifts, you cannot tell which parameter was responsible and have to repeat the exercise. Also note that bigger is not always better; an oversized parameter can be counterproductive.
Pay attention to index utilization. A well-chosen index typically speeds up a query by factors of ten or more, so before tuning, check which indexes the relevant statements actually use: examine the SQL and its access plan to see whether existing indexes are used or new ones are needed. This is often more important than system-level DB2 tuning. But remember that indexes also slow down INSERT operations, so the trade-off must be weighed.
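To check index usage, the Design Advisor and the explain facility can be used; a sketch against the rewritten query from earlier in this article (the explain tables must already have been created):

```sql
-- at the OS prompt: ask the Design Advisor to recommend indexes for a statement
--   db2advis -d FFTEST -s "select C12 from TESTOPT.T1_DMS, TESTOPT.T2_DMS,
--     TESTOPT.T3_DMS where C12_Red=C21_Red and C22=C32 order by C12 asc"
-- or explain the statement and inspect which indexes the plan uses
EXPLAIN PLAN FOR
  SELECT C12 FROM TESTOPT.T1_DMS, TESTOPT.T2_DMS, TESTOPT.T3_DMS
  WHERE C12_Red = C21_Red AND C22 = C32 ORDER BY C12 ASC;
```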
Because XML data can be stored across pages, try to use a large page size when designing an XML database. This avoids cross-page retrieval of XML documents and improves query performance.
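A sketch of a 32 KB page setup for XML storage (the buffer pool, tablespace, and table names are made up):

```sql
-- a buffer pool and tablespace with a 32 KB page size (names are hypothetical)
CREATE BUFFERPOOL BP32K SIZE 1000 PAGESIZE 32K;
CREATE TABLESPACE TS32K PAGESIZE 32K BUFFERPOOL BP32K;
-- XML documents stored here have more room per page
CREATE TABLE TESTOPT.XMLDOCS (ID INTEGER NOT NULL, DOC XML) IN TS32K;
```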
Table partitioning: one customer had a table with a large data volume that gained about 300,000 records every day, while records older than five days were deleted daily, keeping the table at roughly 1.5 million records. The first query each day had to wait for the table to be re-indexed (a lot of data arrives overnight with no index on the newly added rows), so the response was very slow. To solve this, we used table partitioning, with six partitions holding the six days of data. Queries and inserts then each touch only one partition, and response times improved greatly.
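With DB2 range partitioning, the six days might be held as six partitions of one table; the column names and dates below are invented for illustration. The oldest day can then be rolled out with DETACH instead of a mass DELETE:

```sql
-- one partition per day; the dates are placeholders for a rolling six-day window
CREATE TABLE TESTOPT.DAILY_REC (
  REC_DATE DATE NOT NULL,
  PAYLOAD  VARCHAR(100)
) PARTITION BY RANGE (REC_DATE)
  (PARTITION D1 STARTING ('2010-01-01') ENDING ('2010-01-02') EXCLUSIVE,
   PARTITION D2 STARTING ('2010-01-02') ENDING ('2010-01-03') EXCLUSIVE,
   PARTITION D3 STARTING ('2010-01-03') ENDING ('2010-01-04') EXCLUSIVE,
   PARTITION D4 STARTING ('2010-01-04') ENDING ('2010-01-05') EXCLUSIVE,
   PARTITION D5 STARTING ('2010-01-05') ENDING ('2010-01-06') EXCLUSIVE,
   PARTITION D6 STARTING ('2010-01-06') ENDING ('2010-01-07') EXCLUSIVE);
-- roll the oldest day out as its own table instead of deleting row by row
ALTER TABLE TESTOPT.DAILY_REC DETACH PARTITION D1 INTO TESTOPT.OLD_REC;
```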
Understand the CHNGPGS_THRESH parameter: it is the percentage of changed (dirty) pages in the buffer pool at which the page cleaners start writing pages out to disk. In one case, index creation was slow; we found that CHNGPGS_THRESH was too large, so each flush wrote a huge batch of pages and caused an I/O bottleneck. Lowering the parameter appropriately makes the cleaners write more often but less data each time. This is effective when creating indexes on large tables with a large buffer pool.
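The adjustment can be sketched as follows; the value 20 is illustrative (the parameter is a percentage of changed pages, with a default of 60):

```sql
-- lower the changed-pages threshold so page cleaners flush smaller batches more often
UPDATE DB CFG FOR FFTEST USING CHNGPGS_THRESH 20;
```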
When importing data, use LOAD rather than IMPORT wherever possible. In our measurements, IMPORT took 10 minutes where LOAD took about 1 minute, a major gain in working efficiency.
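The two commands compared (the input file name is a placeholder):

```sql
-- IMPORT: inserts row by row, fully logged, and fires triggers
IMPORT FROM data.del OF DEL INSERT INTO TESTOPT.T3;
-- LOAD: formats pages and writes them directly, with minimal logging
LOAD FROM data.del OF DEL INSERT INTO TESTOPT.T3;
```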
Watch the size of db2diag.log. When this file grows very large, every database operation, including stopping and starting DB2, becomes particularly slow and sometimes even hangs. So check its size regularly; if it is too large, archive or delete it and restart DB2. Also, do not set DIAGLEVEL higher than necessary unless you need more information to diagnose a problem; the default value of 3 is usually enough.
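A sketch of keeping the diagnostic log under control:

```sql
-- at the OS prompt: archive the current db2diag.log and start a fresh one
--   db2diag -A
-- keep the diagnostic level at its default unless actively debugging
UPDATE DBM CFG USING DIAGLEVEL 3;
```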