DB2 index problems

Source: Internet
Author: User

Some time ago, as a project, it was plagued by the query efficiency of the database. The database used was DB2 8.2. The details are as follows:

Table A (a_id, a_title, a_addr,...) has more than 50 fields, more than 2 million records, and is about 2 GB in size.
Table B (lib_id, B _id) records the_id record set corresponding to B _id. Each lib_id corresponds to about 50 thousand a table records. Currently, lib_id9 records, with a total of about 0.5 million records, in the future, it may increase to 50 lib_id records, and the number of records reaches 2.5 million.

For example, the following statement is used to query the connections between tables A and B: (about 361 records corresponding to B. lib_id = 40 thousand)

(1) Select count (*) from a, B where a. a_id = B. B _id and B. lib_id = 361
(2) Select count (*) from a, B where a. a_id = B. B _id and A. a_addr = 'China' and B. lib_id = 361
(3) Select count (*) from a, B where a. a_id = B. B _id and A. a_addr like '% %' and B. lib_id = 361
(4) Select count (*) from a, B where a. a_id = B. B _id and A. a_addr like 'ningbo % 'and B. lib_id = 361

Three indexes a_id, a_title, and a_addr are created for a respectively, and a composite index (B _id, a_id) is created for table B. Results:

(1) Fast <1 s
(2), (3), (4) Slow, more than 90 s

A hard disk is added to the machine, and the tablespace in Table A is evenly distributed to two hard disks, in the hope of parallel processing.
However, the poor performance may be caused by the fact that the attached hard disk was too old (an old hard disk a few years ago ). If both are SCSI hard disks, the performance should be improved.
We recommend that you use an MDC table based on the DB2 index design program. We recommend that you use a partition table (both DB2 9.0 and orcal ).

However, the problem was basically solved yesterday, And neither MDC table nor partition table was used. It also proves that it is not a server performance problem.
The solution is simple:Create several composite indexes for table A, namely (a_title, a_id), (a_addr, a_id ).Then perform the test

Test command: db2batch-D dbname-a username/password-F sqlfilename

(1)Select count (*) from a, B where a. a_id = B. B _id result set 0.4 million 1.14 s
(2)Select count (*) from a, B where a. a_id = B. B _id and B. lib_id = 361 result set 40 thousand 0.72 s
(3)Select count (*) from a, B where a. a_id = B. B _id and A. ADDR like 'ningbo % 'and B. lib_id = 361 result set 20 thousand 7.63 s
(4)Select count (*) from a, B where a. a_id = B. B _id and A. ADDR like '% %' and B. lib_id = 361 result set 20 thousand 1.6 s
(5)Select count (*) from a, B where a. a_id = B. B _id and A. ADDR like 'ningbo % 'result set 0.248 million 7.7 s
(6)Select count (*) from a, B where a. a_id = B. B _id and A. ADDR like '% %' result set 0.248 million 1.86 s

Compare (3), (4), (5), (6) two groups of data, we found that the speed of like '% Ningbo %' is several times faster than that of like 'ningbo %!

After researchAdds an index lib_id to table B., The execution time of result (3) is reduced to 0.67 S, and the time of (4) is 1.55 s.But how can we explain the huge difference between statements (5) and (6? In addition, the cost of (5) in the execution plan provided by DB2 is only about 20% of (6!

After repeated experiments, we found that the possible cause is that the result set of like 'ningbo % 'is too large, because the result set of like 'zhejiang %' in other queries is around 4 K, it is indeed much more efficient than like '% Zhejiang %.


As we thought before, when our table had only more than 0.2 million records, it would be fast to create indexes. we can sum up the following experience:

1. Indexing of large tables is very important. server configuration has little impact on the performance of millions of databases.
2. For connection queries, a composite index should be created.
3. When the result set is large, the efficiency of like '%... %' may be much higher than that of like '... %. (To be theoretically verified)

 

 

 

 

 

 

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.