The following sections describe the concept of DB2 consortium and provide a correct understanding of DB2 consortium. We all know that collocation is used to place rows in different tables of the same database partition. The parallel table enables DB2 to use the connection policy more effectively.
You may find that as a response to a specific query, two or more tables provide data frequently. In this case, you want the data in such a table to be as close as possible. In an environment where a database is physically divided into two or more database partitions, there must be a way to keep the table shards as close as possible. The function to complete this process is called the Union of tables DB2.
When accessing multiple tables used for connection or subquery, DB2 (R) Universal Database (DB2 UDB) can identify whether the data to be connected is located in the same database partition. Therefore, DB2 can execute connections or subqueries on the database partition where data is stored without moving data between database partitions. This ability to locally execute connections or subqueries has significant performance advantages.
To enable the DB2 union, the table must:
In the same database partition group, and this database partition group cannot be in the redistribution period. (During the redistribution period, tables in the database partition group may use different partition mappings-they are not collocated .)
The partition key contains the same number of columns.
The corresponding column of the partition key is partition compatible.
If a table is in a single-partition database partition group and the Partition Block is defined on the same partition where the other table is located, the Union can also occur.
Summary:
Data types that can be hashed, such as int, char, and varchar, are compatible with partitions. data types such as LOB and LF cannot be hashed.
For db2. for example, I have two tables A and B, where A contains column c1 int, c2 char, and B contains c1 varchar and c2 int.
Then Table A is partitioned by c1, and table B is partitioned by c2. That is to say, when you insert A row of data to Table. c1 is hashed, and then the number of partitions is modeled to insert new data into the corresponding partition. For Table B, the new data is for table B. c2 hash with the same algorithm.
In this way, when querying select * from c1, c2 where. c1 = B. in c2, because. c1 and B. c2 has the same data type and uses the same hash algorithm. If. c1 and B. c2 is equal, so they must be in the same partition. In this case, when performing the join operation, you do not need to move data in different intervals. Instead, you only need to return data to coord partition after the join operation in this partition.
The above content is an introduction to the parallel configuration of DB2. I hope you will have some gains.