Concept and understanding of DB2 parallel placement

Source: Internet
Author: User

The following sections describe the concept of DB2 consortium and provide a correct understanding of DB2 consortium. We all know that collocation is used to place rows in different tables of the same database partition. The parallel table enables DB2 to use the connection policy more effectively.

You may find that as a response to a specific query, two or more tables provide data frequently. In this case, you want the data in such a table to be as close as possible. In an environment where a database is physically divided into two or more database partitions, there must be a way to keep the table shards as close as possible. The function to complete this process is called the Union of tables DB2.

When accessing multiple tables used for connection or subquery, DB2 (R) Universal Database (DB2 UDB) can identify whether the data to be connected is located in the same database partition. Therefore, DB2 can execute connections or subqueries on the database partition where data is stored without moving data between database partitions. This ability to locally execute connections or subqueries has significant performance advantages.

To enable the DB2 union, the table must:

In the same database partition group, and this database partition group cannot be in the redistribution period. (During the redistribution period, tables in the database partition group may use different partition mappings-they are not collocated .)

The partition key contains the same number of columns.

The corresponding column of the partition key is partition compatible.

If a table is in a single-partition database partition group and the Partition Block is defined on the same partition where the other table is located, the Union can also occur.

Summary:

Data types that can be hashed, such as int, char, and varchar, are compatible with partitions. data types such as LOB and LF cannot be hashed.

For db2. for example, I have two tables A and B, where A contains column c1 int, c2 char, and B contains c1 varchar and c2 int.

Then Table A is partitioned by c1, and table B is partitioned by c2. That is to say, when you insert A row of data to Table. c1 is hashed, and then the number of partitions is modeled to insert new data into the corresponding partition. For Table B, the new data is for table B. c2 hash with the same algorithm.

In this way, when querying select * from c1, c2 where. c1 = B. in c2, because. c1 and B. c2 has the same data type and uses the same hash algorithm. If. c1 and B. c2 is equal, so they must be in the same partition. In this case, when performing the join operation, you do not need to move data in different intervals. Instead, you only need to return data to coord partition after the join operation in this partition.

The above content is an introduction to the parallel configuration of DB2. I hope you will have some gains.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.