Database Partitioning Overview
When the data in the database table increases, the query speed of the data will inevitably decrease, and the performance of the application will decrease. In this case, we should consider partitioning the table, data is stored in different partitions so that data reading can be processed in parallel, greatly improving the data access speed. It should also be noted that after the table is partitioned, the logical table is still complete, but the data in the table is physically stored in different tablespaces (physical files ), in this way, the entire table is not scanned every time during data query.
In general, partition can greatly improve the database performance, and partition management can also improve the flexibility and availability of Administrator data management, such as a partition failure, data in other partitions is still available, and the Administrator only needs to repair the partition. At the same time, you can map different partitions to the disk to balance IO.
Partitioning diagram of Oracle and DB2 1. Architecture of different sharing Modes
To improve data read/write performance, both oracle and db2 adopt parallel methods in the system architecture. The parallel mode can be divided into the following types:
- Share everything
This is the simplest way, that is, executing multiple processes on the same machine in parallel. The processor shares the memory buffer pool and physical storage during processing. This method is subject to physical resources of machines.
- Share disk
In this architecture, all machines share a physical storage. Because each physical machine is independent of the data in the database, this guarantees the maximum number of machines. Similarly, because each machine scans all the tables on the disk, the overall performance is subject to large systems.
- Share memory
That is, all machines share the memory.
- Share nothing
Data is distributed across all machines and each machine has its own storage. Because of this, the increase or decrease of machines will also affect the entire system. At the same time, because each machine can only access its own physical storage, this ensures that the system performance can be almost linearly improved.
Oracle partition Introduction
Oracle uses the share-disk architecture. Therefore, the partition feature of oracle is only on the same disk, which determines that it does not have the DPF feature and only has table partitions. The following describes several major table partitions in oracle.
- Range partition)
Range partition is a common partitioning method that maps data to a partition based on the partition key. A partition key is a column in a table, and a date is a commonly used partition key. Range partition has the following properties:
- Use the values less than clause to specify the upper limit. If the value is greater than or equal to this value, data is allocated to the next partition. Of course, the lower limit of a partition is the upper limit of the previous partition.
- Define MAXVALUE in the highest score area. It represents a value higher than all other partitions, including control.
- List partition (list partition)
List partition can specify the specific values in the column for partitioning. This partition method is not applicable to Range and Hash. This method is somewhat like: users can organize and group data sets with no sequence or relationship. Users can clearly control the number of rows to be partitioned.
- Hash partition (hash partition)
Hash Partitioning maps data to a partition based on the HASH algorithm. The HASH algorithm applies the partition keywords you specify to evenly divide the rows in Partitions. Each partition is approximately the same size. To ensure that data can be evenly distributed, the number of partitions is generally 2N. For example, if you need to insert sales_hash a piece of data, ORACLE will use the HASH algorithm to process salesman_id and then find the partition table to be inserted. Hash Partitioning is an ideal method for distributed data across devices. The HASH algorithm is also easy to convert to the RANGE Partitioning method, especially when the partitioned data is not historical data.
- Composite Partition
Traditional composite partitions include range-list partition and range-Hash partition. Range-list partition is a combination of range and list technology. In essence, table data is partitioned by range first, then, use the list to partition each range again. The content of each sub-partition represents the logical subset of data. Range-Hash partitions are a combination of the range and Hash technology. The essence is to partition the table intervals first, and then separate the data hash in several tablespaces.
RANGE-RANGE, LIST-RANG, LIST-HASH, and LIST-LIST are added to Oralce11G. However, compound partitions are not considered in this article.
DB2 partition Introduction
In DB2, partitions can be divided into the following aspects:
- Database partition
That is, the database partition feature (DPF), as shown in figure 2.
- Table Partitioning
Table partitions in db2 are also called data partitions or interval partitions. Similar to oracle, this data organization mode stores data in different storage objects, the storage objects are data partitions or data ranges divided by values of one or more columns. Each Data Partition is stored in a different tablespace.
- MDC
Multi-dimensional data sets (MDC) are introduced when DB2 V8 aggregates similar data in multiple dimensions and stores them on disks, in this way, the data can be read in the whole block, thus improving the I/O efficiency. For example, for all rows with Product = car and Region = East, you can use these two fields as dimensions to store the data with the same fields in the same location, that is, the so-called block. The size of a block is equal to the size of the extended data block (extent) of the tablespace. The extended data block is a set of consecutive pages on the disk, therefore, rows with similar values are physically stored on consecutive data pages. For more information about MDC, see the "multi-dimensional cluster Table" in the DB2 9.7 Information Center ".
Figure 2. DB2 DPF architecture comparison between Oracle partitions and DB2 partitions
- Oracle partition range
The Range partition of Oracle is very similar to the table partition of DB2 in terms of concept, but it is different in the definition of table partitions. Therefore, the range partitions in Oracle are mapped to the DB2 table partitions. During data migration, you can directly migrate the range partitions in Oracle to the DB2 table partitions. An example of interval Partitioning in Oracle is shown in Listing 1:
Listing 1. Example of interval partitions in Oracle CREATE TABLE sales ( year int, month int ) PARTITION BY RANGE (year, month) (PARTITION p1 VALUES LESS THAN (2002,4) tablespace tbsp1, PARTITION p2 VALUES LESS THAN (2002,7) tablespace tbsp2, PARTITION p3 VALUES LESS THAN (2002,10) tablespace tbsp3, PARTITION p4 VALUES LESS THAN (2002,13) tablespace tbsp4, PARTITION p5 VALUES LESS THAN (2003,4) tablespace tbsp5, PARTITION p6 VALUES LESS THAN (2003,7) tablespace tbsp6, PARTITION p7 VALUES LESS THAN (2003,10) tablespace tbsp7, PARTITION p8 VALUES LESS THAN (MAXVALUE, MAXVALUE) tablespace tbsp8 );
The table partition definition in the corresponding db2:
List 2. Table partition example in DB2 CREATE TABLE sales ( year INT, month INT ) PARTITION BY RANGE (year, month) (STARTING FROM (2001, 1) ENDING (2001,3) IN tbsp1, ENDING (2001,6) IN tbsp2, ENDING (2001,9) IN tbsp3, ENDING (2001,12) IN tbsp4, ENDING (2002,3) IN tbsp5, ENDING (2002,6) IN tbsp6, ENDING (2002,9) IN tbsp7, ENDING AT MAXVALUE );
- List partitions in Oracle
List partitions in Oracle can be directly mapped to table partitions with generated columns in DB2. The definition of list partitions in Oracle has been discussed earlier, the definition of table partitions in DB2 provides a method that can contain generated columns. In this way, table partitions with generated columns are similar to the logical structure of list partitions in Oracle.
Listing 3. List partition example in Oracle CREATE TABLE customer ( cust_id int, cust_prov varchar2(2) Chapter 2. Language compatibility features 127 ) PARTITION BY LIST (cust_prov) (PARTITION p1 VALUES ('AB', 'MB') tablespace tbsp_ab, PARTITION p2 VALUES ('BC') tablespace tbsp_bc, PARTITION p3 VALUES ('SA') tablespace tbsp_mb, … . PARTITION p13 VALUES ('YT') tablespace tbsp_yt, PARTITION p14 VALUES(DEFAULT) tablespace tbsp_remainder );
An example of the table partition with generated columns of DB2 corresponding to this table is as follows:
Listing 4. Table partition example with generated columns in DB2 CREATE TABLE customer ( cust_id INT, cust_prov CHAR(2), cust_prov_gen GENERATED ALWAYS AS (CASE WHEN cust_prov = 'AB' THEN 1 WHEN cust_prov = 'BC' THEN 2 WHEN cust_prov = 'MB' THEN 1 WHEN cust_prov = 'SA' THEN 3 ... WHEN cust_prov = 'YT' THEN 13 ELSE 14 END) ) IN tbsp_ab, tbsp_bc, tbsp_mb, .... tbsp_remainder PARTITION BY RANGE (cust_prov_gen) (STARTING 1 ENDING 14 EVERY 1);
In addition, you can map the List partition of oracle to the MDC of db2.
- Hash partition of Oracle
The definition of hash partition in Oracle is described in Section 1. Therefore, only the definition of DPF is used in DB2. Although Oracle does not have the concept of DPF physically, however, data can be logically allocated to the corresponding partition according to the hash algorithm. Therefore, in data migration, Oracle hash partitions can be mapped to the DPF distributed by hash algorithm in DB2. In addition, it is more efficient to execute hash partitions in Oracle.
Example of hash partition in Oracle
Listing 5. Hash partition example in Oracle CREATE TABLE hash_table ( hash_part date, hash_data varchar2(20) ) PARTITION BY HASH(hash_part) (partition p1 tablespace tbsp1, partition p2 tablespace tbsp2 );
An example of DPF ing to DB2 is as follows:
Listing 6. Table partition example with generated columns in DB2 CREATE TABLE partition_table (partition_date date NOT NULL, partition_data VARCHAR(20) NOT NULL ) IN tbsp_parts DISTRIBUTE BY HASH (partition_date);
To sum up, any single partition in Oracle can be implemented in DB2. In contrast, DB2's unique multi-dimensional cluster cannot be implemented in Oracle,
Figure 3. Oracle-DB2 partition ing simulate Oracle composite partitions in DB2
The oracle composite partition has been introduced earlier. Here we only describe the two in detail (Range-list and Range-hash ), this article also describes how to map the two composite partitions in DB2.
- Range-list
Range-list: This composite partition is the partitioning method combining Range partition and list partition. We have introduced the implementation methods of oracle Range partitions and List partitions in DB2, therefore, this composite partition corresponds to DB2, and we only need to use the preceding non-composite partition method comprehensively.
For example, there is a table store in the ORACLE database. The table uses the sales_date field as the range partition and the store_distrct field as the list partition. The table is defined as follows:
Listing 7. Range-list Example in Oracle create table store( store_id varchar2(10), store_district varchar2(20), sales_date date, revenue number(10) ) partition by range(sales_date) subpartition by list(store_distrct) subpartition template( subpartition subpar1 values ('haidian') tablespace tbsp1, subpartition subpar2 values ('chaoyang') tablespace tbsp1, subpartition subpar3 values ('dongcheng') tablespace tbsp1, subpartition subpar4 values ('xicheng') tablespace tbsp1, subpartition subpar5 values ('fengtai') tablespace tbsp1, subpartition subpar1 values ('shijingshan') tablespace tbsp1) ( partition par1 vales less than (to_date('01/04/2012','DD/MM/YYYY')), partition par2 vales less than (to_date('01/07/2012','DD/MM/YYYY')), partition par3 vales less than (to_date('01/10/2012','DD/MM/YYYY')), partition par4 vales less than (MAXVALUE) );
How can we map the range-list table of oracle to db2? We have explained that Range partition can be mapped to table partition of db2. list partition can be mapped to table partition with generated columns. However, db2 does not have the concept of a composite partition, so we can consider ing to db2 with table partition and MDC. The corresponding table definition in db2 is as follows:
Listing 8. table-MDC example in Db2 create table store( store_id varchar(10), store_distric varchar(20), sales_date date, revenue number(10) ) in tbsp1, tbsp2, tbsp3, tbsp4 partition by range(sales_date) (starting from ('01/01/2012') ending ('31/12/2012') every (3 month)) organize by dimension(store_district);
Like oracle, The sales_date field is used as the range partition field and the store_district field is used as the MDC.
- Range-hash
This partitioning method is a combination of Range partition and hash partition. It is to first set a field to range partition and then subpartition according to hash partition. Considering the hash method used by DB2 In the partitioned database (DPF), we have used DPF and table partitions to simulate this composite partition in DB2: we have created a partition group, this partition group contains multiple database partitions, create a tablespace under this partition group, and then create a table under this tablespace. In this way, we can ensure that the data in this table will be distributed to all the included partitions. At the same time, we use the Range Table partition of DB2, so that within any database partition, data can also be partitioned by range.
Listing 9. Range-hash example in Oracle Create table car( Car_id varchar2(10), Sales_date date, Sales_const number(10) ) Partition by range(sales_date) Subpartition by hash(car_id) (partition s1 values less than (to_date( ‘ 01/04/2012 ’ , ’ DD/MM/YYYY ’ )), partition s2 values less than (to_date( ‘ 01/07/2012 ’ , ’ DD/MM/YYYY ’ )), partition s3 values less than (to_date( ‘ 01/10/2012 ’ , ’ DD/MM/YYYY ’ )), partition s4 values less than (MAXVALUE))
In DB2, we need to first create DPF and then create table partitions on this DPF. Here we will not detail the creation process of DPF. The table definitions corresponding to the above table in DB2 are as follows:
Listing 10. Range-DPF example in DB2 create table car( car_id varchar(10), sales_date date, sales_cost number) in tbsp1, tbsp2, tbsp3, tbsp4, tbsp5 distribute by hash(car_id) partition by range(sales_date) (starting from ('01/01/2012') ending ('31/12/2012') every (3 month));
Data migration from Oracle Partition table to db2
The database migration work is quite complex. This article mainly introduces how to migrate the partition table in oracle to db2. The corresponding partition tion ing between oracle and db2 has been described in the previous section, next we will take the range partitionhe and hash partition tables in oracle for migration to db2 as an example to introduce how to migrate an oracle partition table to db2, the main process for migrating oracle Data to db2 is not described here. In addition to table ing, you also need to deploy table objects, extract data, and deploy data to the partition table corresponding to db2. We can barely create a table partition Table object manually, but it is difficult to extract and deploy data. Therefore, we can use some auxiliary tools developed by IBM, for example, Migration Toolkit (MTK) and Data Movement Tool.
Range partition of oracle to table partition of DB2
Next we assume that there is a range partition table in the Oracle database. The definition of this table is as follows. Next we use this table as a case to migrate this table to the table partition of DB2.
Listing 11. Range partition table definition in Oracle
CREATE TABLE usertest.sales ( year int, month int ) PARTITION BY RANGE (year, month) (PARTITION p1 VALUES LESS THAN (2009,4) tablespace tbsp1, PARTITION p2 VALUES LESS THAN (2009,7) tablespace tbsp2, PARTITION p3 VALUES LESS THAN (2009,10) tablespace tbsp3, PARTITION p4 VALUES LESS THAN (2010,1) tablespace tbsp4, );
Here, usertest is the schema of the table. Four partitions are created in four tablespaces, tbsp1, tbsp2, tbsp3, and tbsp4. the dataset is queried using the select statement.
Figure 4. Data Query Result set of the Range Partition Table in Oracle
Next, we need to migrate the range partition table to db2. First, we need to create a table partition table corresponding to it in db2. Values smaller than the current date in oracle are not included in the corresponding partition, as in the preceding SQL statement, which is different from that in db2. The corresponding table creation statement in db2 is as follows:
Listing 12. Table partition definition in DB2
CREATE TABLE "USERTEST"."SALES" ( "YEAR" DECFLOAT(16) , "MONTH" DECFLOAT(16) ) PARTITION BY RANGE ("YEAR","MONTH") ( PARTITION P1 STARTING (MINVALUE,MINVALUE) INCLUSIVE ENDING (2009, 4) INCLUSIVE ,PARTITION P2 ENDING (2009, 7) INCLUSIVE ,PARTITION P3 ENDING (2009, 10) INCLUSIVE ,PARTITION P4 ENDING (2010, 1) INCLUSIVE )
In DB2, we also define four partitions P1, P2, P3, and P4. During data migration, if we use import and export for data migration, compared with the generated data file, we will find that the data formats of oracle and db2 are completely different. Therefore, it is unrealistic to use import and export for tables with a large data size, in this case, we can use IDMT to export and import table definitions and data. Here we use IDMT for data export and import. The specific instructions for use refer to IDMT instructions http://www.ibm.com/?works/data/library/techarticle/dm-0906datamovement.
After the data is imported, use datastudio to query the data. Based on the comparison of the query result set, we can see that the data has been completely migrated.
Figure 5: data query result set of Table Partition Table in DB2 Hash partition from Oracle to DPF in DB2
Suppose there is a Hash Partition table in an Oracle database. The definition is as follows:
Listing 13. Definition of Hash table in Oracle
CREATE TABLE hash_sales ( salesdate date, protype varchar2(20), proname varchar2(20) ) PARTITION BY HASH(salesdate) ( partition p0 tablespace tbhash, partition p1 tablespace tbhash );
This table uses the hash_part field to partition according to the HASH algorithm. Two partitions p1 and p2 are created in the tablespace tbsp1 and tbsp2 respectively. Use the select statement to query the data result set:
Figure 6. Data Query Result set of Hash Partition Table in Oracle
As mentioned above, we can map the table to the DPF of DB2. If DPF has been created for the current DB, I can directly create a table. However, for migration, you often need to create your own logical partitions, follow the steps below to create DPF, Table space, and Table.
- Add partition with db2start
We can use db2start to create a partition. The command format is as follows:
Db2start dbpartitionnum PARTITIONNUM add dbpartitionnum hostname <HOSTNAME> port <PORTNUM> without tablespaces
Here we already have a partition 0, and we will create another partition 1.
Db2start dbpartitionnum 1 add dbpartitionnum n4shost port 1 withoust tablespaces
- Create a partition group
After the partition is added, you need to create a partition group for the partition. The command format is as follows:
Create database partition group <group name> on dbpartitionnums (<num>, <num>... )
The group we created here is grouphash.
Create database partition group grouphash on dbpartitionnums (0, 1)
- Create tablespace
Create a tablespace in the partition group. The command format is as follows:
Create tablespace <Tablespace name> in database partition group <partition group name>
Here we create a tablespace tphash:
Create tablespace tphash in database partition group grouphash
- Create table
Create a table corresponding to oracle in tbptest. The SQL statement is as follows:
Listing 14. DPF table mapped to in DB2 create table hash_sales( salesdate date, protype varchar(20), proname varchar(20) ) in tphash distribute by hash(salesdate)
After the table is created, data is exported and imported. Similar to the previous section, we can use the IDMT tool. However, there is very little data in this experiment, so here we use import directly to import data. Shows the data query results by using clp after the data is imported.
Figure 7. Summary of the DPF Table query result set mapped in db2
This article gives an overview of the partition modes of Oracle and DB2 respectively, and compares the two data partition modes, this article introduces how to map simple partitions in Oracle in DB2. For Oracle composite partitions, this article also introduces Range-list and Range-hash and proposes a ing scheme in DB2. We also briefly describe how to migrate the range partition and hash partition tables in oracle to db2, and provide corresponding methods.
In short, this article provides an overall ing solution for oracle's partition migration to db2 and provides some simple data migration methods. It still needs to be improved for complex composite partitions.