Implementing dynamic SQL queries in Kettle

Source: Internet
Author: User

In ETL projects, it is common to execute SQL statements, such as data queries, that depend on runtime input parameters. This article describes dynamic queries and parameterized queries using the Table Input step in Kettle. The sample files use the H2 in-memory database, so they can be run directly after download, which makes it easier to learn by example.

Binding field values to placeholders in an SQL query

The first approach to dynamic statements will be familiar to anyone who has executed SQL from code: write an SQL query that contains placeholders, bind values to the placeholders to make it a valid query, and execute it. You can bind multiple values and execute the query in a loop as needed. The sample file for this example is Placeholders.ktr.

The example first creates the Presidents table and fills it with data about the presidents of the United States. The columns are name, state, political party, occupation, college, date of taking office, and date of leaving office:

CREATE TABLE Presidents (
    name VARCHAR(255),
    state VARCHAR(255),
    party VARCHAR(64),
    occupation VARCHAR(64),
    college VARCHAR(64),
    took_office DATE,
    left_office DATE
);

The following query uses question-mark placeholders. The start date is bound to the first question mark and the end date to the second, so the query returns the presidents who took office within a given period:

SELECT name, took_office FROM Presidents WHERE took_office BETWEEN ? AND ?


In the example, a Generate Rows step first produces a single row with two fields, which replace the placeholders in the Table Input step's SQL statement in order. In a real-world scenario, the values would typically be produced dynamically by other steps rather than by a Generate Rows step.

Next comes the Table Input step. Its SQL query contains the question-mark placeholders, and the step that supplies the values for them is chosen in the "Insert data from step" drop-down box.
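The binding that the Table Input step performs can be sketched outside Kettle. The following is a minimal illustration in Python, assuming SQLite (from the standard library) as a stand-in for the article's H2 database and a reduced two-column version of the Presidents table; the tuple date_range is a hypothetical substitute for the row produced by the Generate Rows step.

```python
import sqlite3

# Assumption: SQLite stands in for H2, and only two columns are created.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Presidents (name VARCHAR(255), took_office DATE)")
conn.executemany(
    "INSERT INTO Presidents VALUES (?, ?)",
    [
        ("George Washington", "1789-04-30"),
        ("Abraham Lincoln", "1861-03-04"),
        ("Ulysses S. Grant", "1869-03-04"),
    ],
)

# Plays the role of the single two-field row from the Generate Rows step.
date_range = ("1850-01-01", "1870-01-01")

query = (
    "SELECT name, took_office FROM Presidents "
    "WHERE took_office BETWEEN ? AND ? ORDER BY took_office"
)

# Values are bound by position: first value to the first ?, second to the second ?.
rows = conn.execute(query, date_range).fetchall()
print(rows)
```

The query text never changes here; only the bound values do, which is what makes placeholders a safe way to pass runtime input.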

Executing a query multiple times with different values

To run the query in a loop with different placeholder values, have the preceding step generate multiple rows of data and select the Table Input option "Execute for each row". The sample file for this example is PLACEHOLDERS_IN_LOOP.KTR.
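The effect of "Execute for each row" can be pictured as a loop that runs the same query once per incoming row. The sketch below is an approximation in Python with SQLite standing in for H2; the list parameter_rows is a hypothetical substitute for the rows arriving from the preceding step.

```python
import sqlite3

# Assumption: SQLite stands in for H2, and only two columns are created.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Presidents (name VARCHAR(255), took_office DATE)")
conn.executemany(
    "INSERT INTO Presidents VALUES (?, ?)",
    [
        ("George Washington", "1789-04-30"),
        ("John Adams", "1797-03-04"),
        ("Abraham Lincoln", "1861-03-04"),
        ("Ulysses S. Grant", "1869-03-04"),
    ],
)

# Each tuple plays the role of one incoming row with two fields.
parameter_rows = [
    ("1780-01-01", "1800-01-01"),
    ("1860-01-01", "1870-01-01"),
]

query = (
    "SELECT name FROM Presidents "
    "WHERE took_office BETWEEN ? AND ? ORDER BY took_office"
)

# The query runs once per parameter row; all result rows are collected in order.
names = []
for params in parameter_rows:
    names.extend(name for (name,) in conn.execute(query, params))

print(names)
```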

Limitations of placeholders

Binding values to placeholders is very effective, but it does not cover every scenario. The following kinds of queries are very common, yet none of them can be written with placeholders.

You cannot use a placeholder for a table name; the query will not execute.

SELECT some_field FROM ?

You cannot use a placeholder for a column name either. The following query binds the parameter successfully, but the value is treated as a constant, not as the name of a column.

SELECT ? AS my_field FROM table
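Both limitations can be reproduced with a small experiment. This sketch assumes SQLite in place of H2 and a hypothetical table named test; the exact error raised for a placeholder in the FROM clause differs between databases, so the exception type shown applies to SQLite only.

```python
import sqlite3

# Assumption: SQLite stands in for H2; the table and its row are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, some_field VARCHAR(64))")
conn.execute("INSERT INTO test VALUES (1, 'a')")

# A placeholder in place of a table name is rejected when the statement is parsed.
try:
    conn.execute("SELECT some_field FROM ?", ("test",))
    table_placeholder_failed = False
except sqlite3.OperationalError:
    table_placeholder_failed = True

# A placeholder in the select list binds fine, but only as a constant string:
# the result holds the text 'some_field', not the column's value 'a'.
constant_rows = conn.execute(
    "SELECT ? AS my_field FROM test", ("some_field",)
).fetchall()

print(table_placeholder_failed, constant_rows)
```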

You cannot use a placeholder to bind a comma-separated list of values. If you bind the string "1,2,3" to the following query, you will get unexpected results.

SELECT * FROM test WHERE id IN (?)

The query you might expect to run is:

SELECT * FROM test WHERE id IN (1,2,3)

What actually runs is the query below. The placeholder receives a single string, not three separate values, and in practice it is usually not known in advance how many values will be passed in.

SELECT * FROM test WHERE id IN ('1,2,3')
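The difference is easy to observe directly. The sketch below assumes SQLite in place of H2 and a hypothetical test table with ids 1 to 4. It first binds the single string, then shows one general workaround that is not the article's method: generating one placeholder per value in program code, which already requires building the query text dynamically.

```python
import sqlite3

# Assumption: SQLite stands in for H2; the table contents are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER)")
conn.executemany("INSERT INTO test VALUES (?)", [(1,), (2,), (3,), (4,)])

# Binding "1,2,3" passes ONE string value, so no integer id matches it.
single_string_rows = conn.execute(
    "SELECT * FROM test WHERE id IN (?)", ("1,2,3",)
).fetchall()

# Workaround outside Kettle: emit one ? per value, then bind the values.
ids = [1, 2, 3]
marks = ", ".join("?" for _ in ids)
expanded_query = f"SELECT * FROM test WHERE id IN ({marks}) ORDER BY id"
expanded_rows = conn.execute(expanded_query, ids).fetchall()

print(single_string_rows, expanded_query, expanded_rows)
```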

To handle these scenarios, the query text has to be constructed dynamically using Kettle variables, which are explained in detail below.

Using Kettle variables in SQL queries

The Table Input step supports the substitution of variables or parameters in the query text. Suppose there are several tables with the same structure, namely mammals, birds, and insects; a Kettle variable can then be used as the table name. Suppose we have a variable named animals_table with the value birds, and the "Replace variables" option is selected. If we write the following query:

SELECT name, population FROM ${animals_table}

At execution time, the variable is replaced and the query becomes:

SELECT name, population FROM birds

If the variable is set to mammals or insects instead, a different table is queried. Variables can therefore solve the cases that placeholders cannot handle. The sample file for this example is Variables.ktr; when running it, do not forget to assign a value to the named parameter for testing.
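Variable substitution is plain text replacement that happens before the database sees the query. The following sketch imitates it in Python; substitute_variables is a hypothetical helper written for this illustration and is not Kettle's implementation, SQLite stands in for H2, and the table rows are made-up sample data.

```python
import re
import sqlite3


def substitute_variables(sql, variables):
    """Replace every ${name} in sql with the matching value from variables."""
    return re.sub(r"\$\{(\w+)\}", lambda m: variables[m.group(1)], sql)


# Assumption: SQLite stands in for H2; rows are invented for the example.
conn = sqlite3.connect(":memory:")
for table in ("mammals", "birds", "insects"):
    conn.execute(f"CREATE TABLE {table} (name VARCHAR(64), population INTEGER)")
conn.execute("INSERT INTO birds VALUES ('sparrow', 1000)")
conn.execute("INSERT INTO mammals VALUES ('fox', 50)")

template = "SELECT name, population FROM ${animals_table}"

# The substitution produces ordinary SQL text, which is then executed as-is.
query = substitute_variables(template, {"animals_table": "birds"})
rows = conn.execute(query).fetchall()

print(query)
print(rows)
```

Because the value becomes part of the SQL text itself, it should come from a trusted source, for example a fixed list of allowed table names.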

Using variables and placeholders together

If necessary, the two techniques can be combined. In this example, a variable supplies the table name, and a placeholder receives its value from the preceding step. The sample file is VARIABLES_AND_PLACEHOLDERS.KTR.

SELECT name, population FROM ${animals_table} WHERE population > ?
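The order of operations matters when both techniques are mixed: the variable is substituted into the text first, and the placeholder is bound when the resulting query is executed. A minimal sketch, assuming SQLite in place of H2, made-up sample rows, and a simple string replacement as a stand-in for Kettle's variable substitution:

```python
import sqlite3

# Assumption: SQLite stands in for H2; rows are invented for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE birds (name VARCHAR(64), population INTEGER)")
conn.executemany(
    "INSERT INTO birds VALUES (?, ?)",
    [("sparrow", 1000), ("kiwi", 200)],
)

template = "SELECT name, population FROM ${animals_table} WHERE population > ?"

# Step 1: text substitution fixes the table name in the query text.
query = template.replace("${animals_table}", "birds")

# Step 2: the remaining ? is bound to a value, as if it came from a previous step.
min_population = 500
rows = conn.execute(query, (min_population,)).fetchall()

print(query)
print(rows)
```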

Sample Download

You can download the sample files here. All examples were tested with Kettle 5.1 and use the H2 in-memory database for test data, so they can be run directly after download.
