Data migration combat: kettle-based MySQL to DB2 data migration

Source: Internet
Author: User
Tags db2 uuid windows 7 x64

from:https://my.oschina.net/simpleton/blog/525675

First, what is ETL

ETL, an abbreviation of English extract-transform-load, is used to describe the process of extracting data from the source (Extract), converting (Transform), loading (load) to the destination. The term ETL is more commonly used in data warehouse, but its object is not limited to data warehouse.

Second, kettle simple explanation

Kettle is a foreign open source ETL tool, written in Java, can be run on Windows, Linux, UNIX, data extraction is efficient and stable.

Third, the demand

1, Mysql 5.X Data migration to DB2 8.X

2, the primary key self-increment changed to 32-bit UUID

Iii. preparatory work

1. Source database: Mysql 5.6

2. Target database: DB2 8.X

3. Data Migration Tool: kettle6.0.0.0-353

Iv. start of actual combat

1, explain the omitted steps

Before I started using kettle, I used PowerDesigner18.6 to build the data model, and then I exported the database-building script, where I exported the DB2 8.X script, and first set up the target table in DB2. In fact, in the kettle can also be in the implementation of the process to establish a table, I am also a beginner kettle, so chose a relatively simple way to operate, after all, the focus is on data migration.

2. Installing the JDK

Because Kettle is pure Java write, so rely on the JDK, as for the installation of JDK is not the focus of this article, please the reader to complete their own Baidu.

3, Installation Kettle

Download Pdi-ce-6.0.0.0-353.zip, I have uploaded to Baidu Cloud,: http://pan.baidu.com/s/1jGjMoXw

Kettle is green install-free, unzip the downloaded compressed file, find the program startup file Spoon.bat (Linux spoon.sh), double-click to start. However, before this step, there is an important step to do is to copy our database driver jar package to the specified directory. I am windows 7 x64, so I copied the corresponding driver package MySQL and DB2 to the Data-integration\libswt\win64 directory, because after the copy in the need to restart the program to take effect, so the recommended decompression, First, copy the drive jar package to the specified directory.

Then click Spoon.bat to launch the application.


After the program is started, go to the Welcome page, there is a tree on the left side of the page, under "Main objects" are "conversion" and "job" two function nodes, as well as "core objects" menu, this is used more, will be discussed later.

At this point, the installation of Kettle has been completed, the following to enter the important part of the actual combat.

3. Start using Kettle for data migration

Here I do not speak too much theoretical knowledge, after all, I am also a preliminary study of kettle, so this article is focused on how to operate kettle to complete a data migration.

Step One: Create a conversion (transformation)

Double-click Spoon.bat to start Kettle, click "File" – "New" – "Convert" in the upper-left corner of the interface and save as DEMO.KTR.

Step two: Add "Table Input"

Find "Input – Table input" under "core objects" on the left, and drag it into the edit area on the right.

Step three: Edit "Table Input"

Double-click "Table Input" in the edit area to enter the editing interface.

In the pop-up window, you can edit the information and click "New"

In the pop-up window, enter the source database (MySQL) server information, click the Test button, test the connection is successful.


Note: If you do not copy the jar package from the database to the specified file directory, it will throw an exception saying that the driver could not be found. Just copy the jar package to the specified file (depending on the system, see the installation section of Kettle), and then restart the program to resolve the issue.

Test the connection successfully, click "OK", then select our new connection at "Database link".

Click on "Get SQL query statement"

Select the data table you want to migrate in the pop-up window, and you can only manipulate one table at a time

Select "Yes"

Click "OK" and the source data is configured.

Fourth step: Add and edit field selection (optional)

If your data type changes during the migration process, you need to add the field selection step. For example, in the migration process, I need to change the self-augmentation of the Shaping field to 32-bit UUID, as to why this demand, because I found in the migration process if the primary key is set in the target database, then the data migration process, cannot insert the primary key, only by the database to operate the self-increment, so, Will cause my original associated data to not complete the migration. From this lesson to get experience, after the database primary key design, will occur the main foreign key association, as far as possible to design as UUID.

Find "Field selection" under "Core Objects – transformations" and drag it into the edit area on the right.

Hold down SHIFT and drag a line from table input to field selection

Double-click "Field Selection" to enter the editing interface

Select "Metadata"

Click on "Get Changed fields"



In this place, you can modify the source data table fields to match the target data tables. I need to change the original primary foreign key to the UUID, because I havedefined theprimary foreign key as VARCHAR (+) in advance in the Target data table (DB2), So here I need to modify the primary foreign key of the source data table to be String, without tube length.

Click OK

Fifth step: Add "Table Output"

Under Core objects – output, select Table output and drag to the right edit area

Press and hold shift from field select to table output to drag an arrow and select Main output step (the option appears when the mouse clicks table output during the drag line).

Double-click Table Output, select New

Similar to the previous, enter the database address information in the pop-up window, click Test

Test connection Successful, click OK, select the target database connection for the new DB2 in the database link

Select "Target Mode" (here I don't quite understand, in short DB2 words, appear is schema), then select "Target Table", click OK

Tick "Specify Database Fields" and click "Input Field mappings"

In the pop-up window, select the Source Data table field and the Target data table field matching method, click "Guess" can be a key assignment, but its allocation rules I do not understand, may be based on the similarity of the name, if the name is the same or similar, can be used, very convenient.

Click OK.

Sixth step: "Verify this conversion"

Click "Check this conversion" to verify that the conversion we have just defined is problematic.

If there is no problem, it means that the transformation we have defined can be executed

Seventh step: "Run this transition"

Click "Run this transition", then start to perform data migration, you can see the execution log under the window, if there is no error, then we completed a table of data migration work.

Bulk Operations

If there are more tables, you can define more than one of the "table input"-"field select"-"Table output" steps shown in this conversion, one batch execution. However, such execution does not guarantee the order of execution, and if there is a primary foreign key association, then the bulk execution will be problematic. So you can have the primary foreign key associated with the data migration of the table into a number of "transformation", and then through the job to string them together, so that the execution order can be guaranteed, as for the establishment of the work, this article no longer, the future time to open an article alone, readers can also find information to see, simple operation, It will be at first sight.

I'll show you the approximate homework, I won't write it in detail.

Data migration combat: kettle-based MySQL to DB2 data migration

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.