Using Kettle to connect to dynamically created databases

Source: Internet
Author: User
Tags: create database
I. Problem
In a data warehouse application, a new MySQL database is created every day and named after the date, for example d_p20161201, d_p20161202. Kettle must connect to these databases to do data cleansing and ETL work. Because each database is generated by a script every day, the question is how Kettle can connect to a database whose name changes daily.

II. Solution
1. Create a database connection whose database name contains a variable. The connection cannot be used yet, because the variable has no value.
2. Create a transformation that uses a JavaScript step to set the variable referenced in step 1 to the date part of the database name.
3. Create a job that calls this transformation first, right after the start. Subsequent transformations or jobs can then use the database connection created in step 1 as normal.

III. Verification steps
1. Create the test database and table.
create database if not exists d_20161225;
use d_20161225;
drop table if exists T1;
create table T1 (a int);
insert into T1 values (101), (102);
commit;

2. Create a new transformation and save it as SET_DBNAME.KTR.

3. Create a database connection as follows:

As shown in the figure, the database name references a variable, ${current_date}, which is not yet defined. Testing the database connection at this point therefore reports an error.
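To see why the connection test fails before the variable is set, the following minimal sketch mimics ${name} substitution in a database name. It is an illustration only, not Kettle's implementation: the helper substituteVariables and the naming pattern d_${current_date} are assumptions, and the sketch assumes that a reference to an undefined variable is left in the text unchanged.

```javascript
// Hypothetical helper for illustration only; not Kettle's own code.
// Replaces each ${name} reference with its value from `variables`.
function substituteVariables(text, variables) {
  return text.replace(/\$\{([^}]+)\}/g, function (match, name) {
    // Assumption: an undefined variable leaves the reference untouched.
    return Object.prototype.hasOwnProperty.call(variables, name)
      ? variables[name]
      : match;
  });
}

// Assumed naming pattern for the database name field of the connection.
var dbNameTemplate = "d_${current_date}";

// Before the variable is set, the name is not a real database name.
var before = substituteVariables(dbNameTemplate, {});

// After the variable is set, the name resolves to the test database.
var after = substituteVariables(dbNameTemplate, { current_date: "20161225" });

console.log(before);
console.log(after);
```

Under these assumptions, the unresolved name does not match any existing database, while the resolved name matches the test database created in step 1.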


4. Set the MyDB connection to be shared, so that other transformations and jobs can use it.

5. Edit the SET_DBNAME transformation as follows:

The transformation consists of three steps, shown in the following three figures:


"Generate rows" generates a single row that carries the variable value in the data stream.


"JavaScript" generates a date string in the format used by the database name, for example 20161225.
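The script in this step is only visible in the original figure, so the following is a minimal sketch of what such a script could look like, not the author's code. The function names pad2 and formatYyyyMmDd are hypothetical, and the sketch assumes that the step exposes the script variable current_date as an output field for the next step to read.

```javascript
// Zero-pad a number to two digits. Written without padStart so that it
// also runs on older JavaScript engines.
function pad2(n) {
  return (n < 10 ? "0" : "") + n;
}

// Format a Date as yyyyMMdd using the local time zone.
function formatYyyyMmDd(date) {
  return "" + date.getFullYear() + pad2(date.getMonth() + 1) + pad2(date.getDate());
}

// Assumption: this variable is exposed as an output field of the step,
// so that the following step can copy it into the Kettle variable.
var current_date = formatYyyyMmDd(new Date());
```

Plain string concatenation keeps the script independent of any helper functions a particular Kettle version may or may not provide.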


"Set variables" assigns the generated value to the variable.

6. Create a new transformation as follows and save it as TABLE_OUTPUT.KTR.

This transformation has only two steps, which test output from the database, as shown in the following two figures:


"Table input" queries the data in table T1.


"Text file output" stores the table data in a TXT file.

7. Create a new job that calls the two transformations created above, and save it as CONNECT_DB.KJB.


8. Run the job.

9. View the contents of the output file, as shown in the following figure:

The table data is queried correctly.

10. Test the MyDB database connection again. It now succeeds.

IV. Summary
    This experiment tried out the following two points:
1. A JavaScript step can compute the values assigned to variables. Programming in Kettle this way makes it possible to implement very complex application logic.
2. Database connections can reference variables dynamically at runtime, which makes a unified ETL schedule possible.
    Setting variables first and then using them in subsequent steps or job entries is a common technique. Programming inside Kettle greatly extends what it can do.

Reference: http://stackoverflow.com/questions/23491072/pass-db-connection-parameters-to-a-kettle-a-k-a-pdi-table-input-step-dynamically
