Kettle MongoDB Data Synchronization

Source: Internet
Author: User
Tags: bulk insert, modifier

Requirements:

1. When a new record is added to the source database, a new record is added to the target database.

2. When a record is modified in the source database, the corresponding record in the target database is modified as well.
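The two requirements above can be sketched in plain Python, without Kettle or MongoDB. This is only an illustration of the intended behavior: the function name `sync`, the in-memory dictionary standing in for the target, and the `name` field are all assumptions made for the example.

```python
from typing import Any, Dict, Iterable

Record = Dict[str, Any]


def sync(source_rows: Iterable[Record], target: Dict[Any, Record], key: str = "ID") -> Dict[str, int]:
    """Apply source rows to an in-memory target keyed by `key`.

    Returns how many rows were inserted and how many were updated.
    """
    counts = {"inserted": 0, "updated": 0}
    for row in source_rows:
        row_key = row[key]
        if row_key in target:
            # Requirement 2: the record already exists, so modify it.
            counts["updated"] += 1
        else:
            # Requirement 1: the record is new, so add it.
            counts["inserted"] += 1
        target[row_key] = dict(row)
    return counts


if __name__ == "__main__":
    target: Dict[Any, Record] = {}
    # First run: ID 1 is new, so it is inserted.
    print(sync([{"ID": 1, "name": "a"}], target))
    # Second run: ID 1 changed (updated), ID 2 is new (inserted).
    print(sync([{"ID": 1, "name": "b"}, {"ID": 2, "name": "c"}], target))
```

Keying the target on the ID field is what lets a later run recognize a modified record instead of adding it a second time.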

The example uses three Kettle steps.

The configuration of each step is described in detail below.

Source:

This example connects to a MongoDB database and reads four fields. The ID field serves as the primary key, and the _id field is left to be generated automatically by the system for the time being.

A detailed description of this step is available on the official website: http://wiki.pentaho.com/display/EAI/MongoDB+Input

Value mappings:

This step is not essential to the example; it is included only to test its effect.

MongoDB Output:

The configuration of this step is the key part.

The official documentation explains this tab as follows:

2.2 Selecting the write mode

The MongoDB Output step provides a number of options which control what and how data is written to the target Mongo document collection. By default, data is inserted into the target collection. If the specified collection doesn't exist, it will be created before data is inserted. Selecting the Truncate option will delete any existing data in the target collection before inserting begins. Unless unique indexes are being used (see the section on indexing below), MongoDB will allow duplicate records to be inserted. MongoDB allows for fast bulk insert operations; the batch size can be configured using the batch insert size field. If no value is supplied here, then the default batch size is used.
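The insert behavior described above (optional truncate, duplicates allowed, writes grouped by batch size) can be sketched in plain Python. This is not the step's implementation: `batches`, `bulk_insert`, and the list standing in for the target collection are hypothetical names chosen for illustration.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List

Record = Dict[str, Any]


def batches(rows: Iterable[Record], batch_size: int) -> Iterator[List[Record]]:
    """Yield successive lists of at most `batch_size` rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(rows)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk


def bulk_insert(target: List[Record], rows: Iterable[Record],
                batch_size: int, truncate: bool = False) -> int:
    """Insert rows batch by batch; return the number of batches written."""
    if truncate:
        # Truncate: delete existing data before inserting begins.
        target.clear()
    written = 0
    for chunk in batches(rows, batch_size):
        # Each chunk stands in for one bulk write to the collection.
        target.extend(chunk)
        written += 1
    return written


if __name__ == "__main__":
    target: List[Record] = [{"ID": 0}]
    print(bulk_insert(target, [{"ID": i} for i in range(5)], 2, truncate=True))
    print(len(target))
```

Because this sketch has no unique index, inserting a row with an ID that already exists simply adds a duplicate, which matches the default behavior the quoted paragraph describes.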

Selecting the Upsert option changes the write mode from insert to upsert (i.e. update if a match is found, otherwise insert a new record). Information on defining how records are matched can be found in the next section. Standard upsert replaces a matched record with an entire new record based on all the incoming fields specified in the Mongo document fields tab. Modifier update enables modifier ($ operators) to be used to mutate individual fields within matching documents. This type of update is fast and involves minimal network traffic; it also has the ability to update all matching documents, rather than just the first, if the multi-update option is enabled.
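The difference between a standard upsert and a modifier update can be illustrated with a small in-memory sketch. It assumes simple equality matching and a `$set`-like change; the function names `upsert_replace` and `upsert_set`, and the sample fields, are hypothetical and do not come from Kettle or a MongoDB driver.

```python
from typing import Any, Dict, List

Doc = Dict[str, Any]


def _matches(doc: Doc, query: Doc) -> bool:
    """True if every field in the query equals the document's value."""
    return all(doc.get(k) == v for k, v in query.items())


def upsert_replace(coll: List[Doc], query: Doc, new_doc: Doc) -> None:
    """Standard upsert: replace the first match wholesale, else insert."""
    for i, doc in enumerate(coll):
        if _matches(doc, query):
            coll[i] = dict(new_doc)  # fields absent from new_doc are lost
            return
    coll.append(dict(new_doc))


def upsert_set(coll: List[Doc], query: Doc, changes: Doc, multi: bool = False) -> None:
    """Modifier-style upsert: change only the given fields, like $set."""
    matched = False
    for doc in coll:
        if _matches(doc, query):
            doc.update(changes)  # untouched fields are preserved
            matched = True
            if not multi:
                return  # without multi-update only the first match changes
    if not matched:
        # No match: insert a document built from the query and the changes.
        coll.append({**query, **changes})


if __name__ == "__main__":
    coll = [{"ID": 1, "name": "a", "city": "x"}]
    upsert_replace(coll, {"ID": 1}, {"ID": 1, "name": "b"})
    print(coll)  # the "city" field is gone after a full replace
    upsert_set(coll, {"ID": 1}, {"city": "y"})
    print(coll)  # only "city" was added back; "name" is unchanged
```

The practical consequence is that a standard upsert needs every field in the incoming row, while a modifier update can carry only the fields that changed.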

My understanding is that once the option circled in red is ticked, any record that is modified or added in the source triggers the corresponding operation in the target database. The next tab must also be configured.

For the ID primary key, be sure to set "Match field for update" to Y; otherwise an error occurs at run time.
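One way to picture why a match field is required: an upsert needs a query to find the existing record, and that query is built from the fields flagged Y. The sketch below is an assumption about the idea, not Kettle's code; `FieldConfig` and `build_match_query` are hypothetical names, and the raised error only stands in for the run-time error mentioned above.

```python
from typing import Any, Dict, List, NamedTuple


class FieldConfig(NamedTuple):
    name: str
    match_for_update: bool  # True corresponds to Y in the step dialog


def build_match_query(fields: List[FieldConfig], row: Dict[str, Any]) -> Dict[str, Any]:
    """Build the lookup query from the fields flagged as match fields."""
    query = {f.name: row[f.name] for f in fields if f.match_for_update}
    if not query:
        # With no match field there is nothing to look a record up by.
        raise ValueError("upsert needs at least one field with match field for update = Y")
    return query


if __name__ == "__main__":
    fields = [FieldConfig("ID", True), FieldConfig("name", False)]
    print(build_match_query(fields, {"ID": 7, "name": "a"}))
```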

The steps listed above are the most important part of the synchronization process. If you want to set up more powerful features, study the official documentation in detail.

Official documentation: http://wiki.pentaho.com/display/EAI/

Sample transformation file: http://files.cnblogs.com/nyzhai/mongodbTran.rar
