Background
Use the Kettle tool to perform incremental extraction on a table.
Solution
Use the Set Variables step to accomplish this, as shown in the diagram below:
Experiment
First, prepare the experimental environment by hand. The goal is to achieve roughly the following:
SELECT t.*, t.rowid FROM emp_etl t;
SELECT MAX(hiredate) MAXSJ FROM emp_etl;
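The source table can be prepared by hand. A minimal sketch, assuming an Oracle database and an EMP-like schema; only the `hiredate` column is confirmed by the text, the other columns and values are illustrative:

```sql
-- Hypothetical source table; hiredate is the timestamp column used
-- for the increment, the other columns are assumed for illustration.
CREATE TABLE emp_etl (
  empno    NUMBER(4),
  ename    VARCHAR2(10),
  hiredate DATE
);

-- Two sample rows dated 2015/10/22, matching the experiment below.
INSERT INTO emp_etl (empno, ename, hiredate)
  VALUES (9001, 'TEST1', TO_DATE('2015-10-22', 'yyyy-mm-dd'));
INSERT INTO emp_etl (empno, ename, hiredate)
  VALUES (9002, 'TEST2', TO_DATE('2015-10-22', 'yyyy-mm-dd'));
COMMIT;
```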
We will verify that the rows dated October 22, 2015 are inserted into the target table.
Create the target table:
CREATE TABLE emp_etl_1 AS SELECT * FROM emp_etl t WHERE 1=2;
At this point emp_etl_1 is empty; we will insert only the rows whose hiredate is 2015/10/22.
You can see that two rows meet this condition, as shown below:
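The two matching rows can be checked with a query such as the following sketch; the TO_DATE format string is an assumption:

```sql
-- List the source rows hired on 2015/10/22; the experiment
-- expects exactly two rows back.
SELECT t.*
  FROM emp_etl t
 WHERE t.hiredate = TO_DATE('2015-10-22', 'yyyy-mm-dd');
```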
Write the timestamp .ktr transformation, which reads the largest hiredate in the source table and sets it as a variable, for example:
In the subsequent "Table input" step, use the passed ${MAXSJ} as a condition to complete the insertion of data into the target table. The process, briefly, is as follows:
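The "Table input" query might look like the sketch below. The comparison operator and the date format assumed for ${MAXSJ} are guesses; adjust them to match how the Set Variables step formats the value. Note that variable substitution in a Table input step only happens when the "Replace variables in script" option is enabled:

```sql
-- ${MAXSJ} is substituted by Kettle before the query runs.
-- In this experiment the target is empty and MAXSJ holds the
-- newest hiredate, so equality selects the two 2015/10/22 rows;
-- a production increment would compare against the last loaded
-- timestamp with > instead.
SELECT t.*
  FROM emp_etl t
 WHERE t.hiredate = TO_DATE('${MAXSJ}', 'yyyy-mm-dd hh24:mi:ss');
```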
Run the transformation to complete the data insertion, as shown in the diagram below:
Note: this experiment is only a simple example of the method. When increments are driven by taking a full-table MAX on every run, performance suffers once the data volume becomes particularly large. In that case, add a timestamp record table with a timestamp field; after each extraction, record the timestamp of the newest extracted record in that table. Each subsequent run then only needs to read its start time from the timestamp table.
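The timestamp-table approach described above could be sketched as follows; the table and column names are hypothetical:

```sql
-- One row per extracted table, holding the newest extracted timestamp.
CREATE TABLE etl_ts_log (
  table_name VARCHAR2(30) PRIMARY KEY,
  last_ts    DATE
);

-- After each extraction, record the newest timestamp that was loaded.
UPDATE etl_ts_log
   SET last_ts = (SELECT MAX(hiredate) FROM emp_etl_1)
 WHERE table_name = 'EMP_ETL';

-- Each run reads its start time from this small table instead of
-- scanning MAX(hiredate) over the large source table.
SELECT last_ts FROM etl_ts_log WHERE table_name = 'EMP_ETL';
```

The extraction query would then take the returned last_ts as its lower bound, e.g. WHERE hiredate > :last_ts.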
A small piece of knowledge: simple, and easy to remember.
"The Growth of Blue" notes series _20151022
Original work from the "Blue Blog" blog. Reprints are welcome, but please be sure to indicate the source (http://blog.csdn.net/huangyanlong).
Data Cleaning Notes (1): Kettle Data Increment Using Set Variables (a small case)