Kettle Learning Summary (1)

Source: Internet
Author: User

Recently, because of project requirements, I started working with Kettle. Here I sort out what I learned while using Kettle to develop jobs over the past two weeks and share it with you.

 

I. What is kettle?

Kettle is an ETL tool that is mainly used to manage data from different data sources and move it along in a defined way. Its most common use is transferring data between different systems: you use Kettle to build a transformation job that does the transfer. It is written in pure Java, so it is most compatible with Java environments.

Kettle consists of four parts: Spoon, Pan, Kitchen, and Chef. This summary mainly covers Spoon and Kitchen, which are the most widely used. Spoon is the core graphical interface: you build a series of data-flow transformations by dragging components onto the canvas and configuring them. Kitchen is mainly called from BAT files to run a series of jobs in batch, for example as scheduled tasks on Windows.
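As a rough illustration of the kind of command line such a BAT file would contain, here is a minimal Python sketch that assembles a Kitchen call for a job stored in a repository. The option names (rep, user, pass, dir, job, level) are the ones I know from Kitchen's command-line help; the repository name `kettle_repo`, the job name `daily_load`, and the helper function itself are made up for the example, so check the options against your own Kettle version before relying on it.

```python
from typing import List


def build_kitchen_command(
    kitchen: str,
    repo: str,
    user: str,
    password: str,
    job: str,
    directory: str = "/",
    level: str = "Basic",
    windows: bool = True,
) -> List[str]:
    """Return the argument list for one Kitchen run of a repository job.

    This helper is hypothetical; it only formats options and does not
    start Kitchen itself.
    """
    options = [
        ("rep", repo),        # repository name
        ("user", user),       # repository user
        ("pass", password),   # repository password
        ("dir", directory),   # repository directory that holds the job
        ("job", job),         # job name
        ("level", level),     # logging level
    ]
    if windows:
        # Kitchen.bat style: /option:value
        return [kitchen] + [f"/{name}:{value}" for name, value in options]
    # kitchen.sh style: -option=value
    return [kitchen] + [f"-{name}={value}" for name, value in options]


if __name__ == "__main__":
    # Hypothetical repository and job names.
    cmd = build_kitchen_command(
        "Kitchen.bat", "kettle_repo", "admin", "admin", "daily_load"
    )
    print(" ".join(cmd))
```

The printed line is what you would paste into a BAT file and register as a Windows scheduled task. Returning a list instead of a single string also lets you pass the result straight to `subprocess.run` if you prefer to launch Kitchen from Python.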

 

II. Kettle script files

1. Transformation: performs the basic data conversion.

2. Job: controls the entire workflow.
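When the two kinds of script are saved as files, Spoon normally uses the .ktr extension for transformations and .kjb for jobs, and the command-line tools split along the same line: Pan runs transformations and Kitchen runs jobs. The small Python sketch below captures that mapping. The function name and the example file names are made up for illustration.

```python
from pathlib import PurePath
from typing import Tuple

# Extension -> (script type, command-line tool that runs it).
RUNNERS = {
    ".ktr": ("transformation", "Pan"),
    ".kjb": ("job", "Kitchen"),
}


def classify_script(filename: str) -> Tuple[str, str]:
    """Return (script type, runner) for a Kettle script file name."""
    suffix = PurePath(filename).suffix.lower()
    if suffix not in RUNNERS:
        raise ValueError(f"not a Kettle script file: {filename}")
    return RUNNERS[suffix]


if __name__ == "__main__":
    # Hypothetical file names.
    for name in ("load_orders.ktr", "nightly_batch.kjb"):
        kind, runner = classify_script(name)
        print(f"{name}: {kind}, run with {runner}")
```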

III. Repository configuration (based on version 4.4.0)

The repository is mainly used to store the transformations and jobs written in the Kettle tool.

There are two types of repository:

Kettle database repository

Kettle file repository

The first is the database repository: transformations and jobs are stored in tables in the repository database. When you configure it, Kettle shows the SQL statements for creating those tables; executing them creates the tables. Most people create a database repository.

The other is the file repository: transformations and jobs are stored in files. It is not widely used.

The following describes how to configure a MySQL repository. (The Oracle configuration is relatively simple and the steps are basically the same; creating the repository tables runs into a bug with MySQL, described in step 5.)

1. Click the button to go to the repository configuration page.

2. Select the first option, the database repository, and click OK. On the page that appears, choose to create a new repository database.

3. Configure the database connection. Kettle does not ship with the database driver jar, so you need to place the jar manually in the Kettle installation directory (D:\tools\kettle\data-integration\lib), then click Test to check whether the connection succeeds.

4. If the database connection succeeds, click OK to create the repository tables.

5. Click Create or Update. A dialog box pops up containing the SQL statements used to create the tables. Do not execute them here; copy and paste them into a database tool and run them directly against the database. (When Kettle itself executes the MySQL table-creation statements, it reports an error, but the same statements run without error directly in the database. We have only seen this with MySQL; Oracle does not have the problem.)

6. Log on to the repository. The default username and password are admin and admin.

7. The repository configuration is now complete.
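Step 3 is the one that usually goes wrong: if the driver jar is missing, the connection test fails. A quick way to check before opening Spoon is to look for the jar in the directory named above. This is a minimal Python sketch; the file-name pattern `mysql-connector*.jar` is an assumption based on how the MySQL driver is usually named, and the helper function is made up for the example.

```python
from pathlib import Path
from typing import List


def find_jdbc_jars(lib_dir: str, pattern: str = "mysql-connector*.jar") -> List[str]:
    """Return the names of driver jars in lib_dir that match pattern.

    An empty list means the directory does not exist or holds no
    matching jar, so the connection test in Spoon is likely to fail.
    """
    directory = Path(lib_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob(pattern))


if __name__ == "__main__":
    # Directory taken from step 3; adjust it to your own installation.
    jars = find_jdbc_jars(r"D:\tools\kettle\data-integration\lib")
    if jars:
        print("Found driver jar(s):", ", ".join(jars))
    else:
        print("No MySQL driver jar found; copy it in before testing the connection.")
```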
