Correct understanding of DB2 partition compatibility and description of the concept of collocated

Source: Internet
Author: User

This article mainly describes the concept of DB2 partition compatibility and coexistence. In practice, you can compare the basic data types of the corresponding columns of the partition key, you can declare them as partition compatible ).

Partition-compatible data types have the following attributes: two variables with the same value but different types are mapped to the same partition number according to the same partition algorithm.

Recently I have been reading some DB2 DPF things. I have two concepts that I don't understand.

This is the concept of "DB2 partition compatibility" and "Parallel.

I found some related materials, including the following:

Partition compatibility

You can compare the basic data types of the corresponding columns of the partition key, and declare them as partition compatible ). Partition-compatible data types have the following attributes: two variables with the same value but different types are mapped to the same partition number according to the same partition algorithm. DB2 partition compatibility has the following features:

The basic data type is compatible with another similar basic data type.

The internal format is used for DATE, TIME, and TIMESTAMP data types. They are not compatible with each other and are not compatible with CHAR.

Partition compatibility is NOT affected by columns defined with not null or for bit data.

Processing of NULL values of compatible data types is completely the same; processing of NULL values of incompatible data types may be different.

The basic data types of user-defined types are used to analyze partition compatibility.

The processing of decimal places with the same value in the partition key is exactly the same, even if their scale and precision are different.

Spaces at the end of a string (CHAR, varchar graphic, or VARGRAPHIC) are ignored by the hash algorithm.

BIGINT, SMALLINT, and INTEGER are compatible data types.

REAL and FLOAT are compatible data types.

CHAR and VARCHAR of different lengths are compatible data types.

GRAPHIC and VARGRAPHIC are compatible data types.

DB2 partition compatibility is not applicable to long varchar, long vargraphic, CLOB, DBCLOB, and BLOB data types because they cannot be used as partition keys.

Collocated

Collocation is used to place rows in different tables in the same database partition that contain relevant data. The parallel table enables DB2 to use the connection policy more effectively.

You may find that as a response to a specific query, two or more tables provide data frequently. In this case, you want the data in such a table to be as close as possible. In an environment where a database is physically divided into two or more database partitions, there must be a way to keep the table shards as close as possible. The function to complete this process is called table concatenation.

When accessing multiple tables used for connection or subquery, DB2 (R) Universal Database (DB2 UDB) can identify whether the data to be connected is located in the same database partition. Therefore, DB2 can execute connections or subqueries on the database partition where data is stored without moving data between database partitions. This ability to locally execute connections or subqueries has significant performance advantages.

To be collocated, the table must:

In the same database partition group, and this database partition group cannot be in the redistribution period. (During the redistribution period, tables in the database partition group may use different partition mappings-they are not collocated .)

The partition key contains the same number of columns.

The corresponding column of the partition key is partition compatible.

If a table is in a single-partition database partition group and the Partition Block is defined on the same partition where the other table is located, the Union can also occur.

Summary:

For example, data types such as int, char, and varchar that can be hashed are compatible with DB2 partitions. data types such as LOB and LF cannot be hashed.

For the Union, for example, I have two tables A and B, where A contains column c1 int, c2 char, and B contains c1 varchar and c2 int.

Then Table A is partitioned by c1, and table B is partitioned by c2. That is to say, when you insert A row of data to Table. c1 is hashed, and then the number of partitions is modeled to insert new data into the corresponding partition. For Table B, the new data is for table B. c2 hash with the same algorithm.

In this way, when querying select * from c1, c2 where. c1 = B. in c2, because. c1 and B. c2 has the same data type and uses the same hash algorithm. If. c1 and B. c2 is equal, so they must be in the same partition. In this case, when performing the join operation, you do not need to move data in different intervals. Instead, you only need to return data to coord partition after the join operation in this partition.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.