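Requirements:
1. When a record is added to the source database, the same record is added to the target database.
2. When a record is modified in the source database, the same record is modified in the target database.
The example uses three Kettle components: Source (a MongoDB input step), a value mapping step, and MongoDbOutput.
The configuration of each component is described below.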
Source:
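This example connects to a MongoDB database. The collection has four fields; ID is treated as the primary key, and _id is generated automatically by the system and can be ignored for now.
For a detailed description of this step, see the official wiki: http://wiki.pentaho.com/display/EAI/MongoDB+Input
Value mapping:
This step does little in this example and is included only to test the effect. Configure it as shown in the screenshot.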
MongoDbOutput:
The configuration of this step is the key part.
The official wiki explains this tab as follows:
2.2 Selecting the write mode
The MongoDb output step provides a number of options that control what and how data is written to the target Mongo document collection. By default, data is inserted into the target collection. If the specified collection doesn't exist, it will be created before data is inserted. Selecting the Truncate option will delete any existing data in the target collection before inserting begins. Unless unique indexes are being used (see section on indexing below) then Mongo DB will allow duplicate records to be inserted. Mongo DB allows for fast bulk insert operations - the batch size can be configured using the Batch insert size field. If no value is supplied here, then the default size of 100 rows is used.
Selecting the Upsert option changes the write mode from insert to upsert (i.e. update if a match is found, otherwise insert a new record). Information on defining how records are matched can be found in the next section. Standard upsert replaces a matched record with an entire new record based on all the incoming fields specified in the Mongo document fields tab. Modifier update enables modifier ($ operators) to be used to mutate individual fields within matching documents. This type of update is fast and involves minimal network traffic; it also has the ability to update all matching documents, rather than just the first, if the Multi-update option is enabled.
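To make the three write modes in the quoted passage concrete, here is a minimal in-memory sketch in Python. It is not Kettle or MongoDB code: the target collection is a plain list of dicts, the function names (insert, upsert_replace, upsert_set) and the fields name and city are made up for illustration, and MongoDB's own _id handling is not modeled.

```python
from copy import deepcopy


def _matches(doc, match):
    """True if every key/value pair in `match` is present in `doc`."""
    return all(doc.get(key) == value for key, value in match.items())


def insert(collection, doc):
    """Plain insert: always appends, so duplicate records are possible."""
    collection.append(deepcopy(doc))


def upsert_replace(collection, match, doc):
    """Standard upsert: replace the first matching record with the whole
    incoming record, or insert it if nothing matches."""
    for index, existing in enumerate(collection):
        if _matches(existing, match):
            collection[index] = deepcopy(doc)
            return "updated"
    collection.append(deepcopy(doc))
    return "inserted"


def upsert_set(collection, match, fields, multi=False):
    """Modifier-style upsert (similar in spirit to $set): change only the
    given fields, on the first match or on all matches if multi=True."""
    hits = 0
    for existing in collection:
        if _matches(existing, match):
            existing.update(fields)
            hits += 1
            if not multi:
                break
    if hits == 0:
        collection.append({**match, **fields})
        return "inserted"
    return "updated"


if __name__ == "__main__":
    target = []
    upsert_replace(target, {"ID": 1}, {"ID": 1, "name": "a", "city": "x"})
    # The second call matches ID 1 and replaces the whole record,
    # so the city field from the first call is dropped.
    upsert_replace(target, {"ID": 1}, {"ID": 1, "name": "b"})
    # The modifier-style call only touches city and keeps name.
    upsert_set(target, {"ID": 1}, {"city": "y"})
    print(target)
```

The point to notice is that a standard upsert discards fields that are missing from the incoming row, while the modifier-style update leaves them alone.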
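My understanding is that once the write-mode option circled in red in the screenshot is checked, any record added or modified in the source is added or modified in the target as well. One more setting is still required, however.
Because ID is the primary key, its "Match field for update" value must be set to Y; otherwise the transformation fails at run time.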
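The role of the match field can be sketched as follows. This is only an illustration of the idea, not Kettle's actual implementation: build_upsert and the (field name, match flag) pairs are hypothetical stand-ins for the rows of the step's field table, and the ValueError merely mimics the fact that an upsert with no match field has nothing to look records up by.

```python
def build_upsert(row, field_specs):
    """Split an incoming row into (query, document).

    field_specs is a list of (field_name, is_match_field) pairs.
    Fields flagged as match fields form the lookup query; all listed
    fields form the document to write.
    """
    query = {name: row[name] for name, is_match in field_specs if is_match}
    if not query:
        # Without a match field there is no way to find the existing record.
        raise ValueError("upsert needs at least one match field")
    document = {name: row[name] for name, _ in field_specs}
    return query, document


if __name__ == "__main__":
    # "name" is a made-up field; only ID comes from the example above.
    specs = [("ID", True), ("name", False)]
    query, document = build_upsert({"ID": 7, "name": "a"}, specs)
    print(query)     # {'ID': 7}
    print(document)  # {'ID': 7, 'name': 'a'}
```

With ID flagged as the match field, running the same row through twice targets the same record both times, which is what keeps the target from accumulating duplicates.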
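The settings listed above are the essential ones for the synchronization. For more advanced features, consult the official documentation.
Official documentation: http://wiki.pentaho.com/display/EAI/
Sample transformation (.ktr) file: http://files.cnblogs.com/nyzhai/mongodbTran.rar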
Kettle: MongoDB data synchronization