Kettle uses timestamp to incrementally synchronize databases (I) bitsCN.com
The main idea of this experiment is that when creating a database table,
By adding an additional field, that is, the timestamp field,
For example, when synchronizing table tt1 and table tt2,
By checking that the table is updated, the table is used as the new table, and the old table is updated by the data in the new table.
The experiment data is as follows:
Mysql database 5.1
Test. tt1 (id int primary key, name varchar (50 ));
Mysql. tt2 (id int primary key, name varchar (50 ));
Snapshot Table, which can be stored in the test database,
You can also create a temporary table for ease of use.
Data kettle-1
Kettle-1
========================================================== ================================
Main process kettle-2
Kettle-2
In prepare, add the timestamp field to the tt1 and tt2 tables,
Because the databases where tt1 and tt2 are located are different, create two database connections respectively.
Prepare
Kettle-3
After the job is executed, the following fields are displayed during database query:
Kettle-4
Then, let's perform an insert operation and an update operation on the tt1 table ~
Kettle-5
In the original table, whether it is an insert or update operation, the corresponding updateTime will change.
If the updateTime field in table tt1 and table tt2 is the latest time, it indicates that the table is a new table.
As long as it corresponds to main_thread:
Kettle-6
The process in main_thread is as follows:
Create a snapshot table, and insert the maximum (latest) timestamp value in table tt1 and table tt2 to the snapshot table.
Then, a transformation is used to judge which table has the latest updateTime value,
To select whether the tt1 table is used to update the tt2 or tt2 table to update the tt1 table;
Main_thread.create_tempTable.JOB:
Main_thread.insert_tempTable.Job:
Main_thread.tt1_tt2_syn.Transformation:
First, create a connection to the temp table of the test database,
Select the record with the latest lastTime value in the temp table
The corresponding id number.
First, sort the lastTime field in temp in descending order,
Select the id and set the selected record to only one row.
Then, select a switch based on the value of id.
Here LZ is very keen to use, SQL Executor,
However, it cannot return the corresponding id value.
However, the corresponding id value can be returned for table input,
And received by the switch.
When merging records and synchronizing records, the two situations are the same.
In this case, the experiment environment has been set up,
Next, we will test the experiment data and write it to the next blog.
Of course, triggers are also a good method for synchronization. please write them to your blog later ~
Compared with triggers, the timestamp method is simple and common,
However, the timestamp field in the database table has no other purposes except for this purpose,
It consumes a lot of memory space.
BitsCN.com