ETL (extract, transform, load) is a data warehousing technique for moving data from a source (for example, the database of an earlier project) through extraction, transformation, and loading to a destination (the project currently under development). In other words, when a new project needs the data held in a previous project's database, ETL is the way to get it there.
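To make the three stages concrete, here is a minimal sketch in Python using the standard sqlite3 module. It is only an illustration of the extract, transform, load sequence, not how Kettle itself works; the table names (employee, staff), the columns, and the cleaning rules are all assumptions invented for the example.

```python
import sqlite3


def extract(source):
    """Extract: read the raw rows from the old project's (hypothetical) table."""
    return source.execute("SELECT id, name, dept FROM employee").fetchall()


def transform(rows):
    """Transform: drop rows without a name, tidy names and departments."""
    cleaned = []
    for emp_id, name, dept in rows:
        if name is None or not name.strip():
            continue  # skip rows that have no usable name
        cleaned.append(
            (emp_id, name.strip().title(), (dept or "unknown").strip().lower())
        )
    return cleaned


def load(target, rows):
    """Load: write the cleaned rows into the new project's (hypothetical) table."""
    target.executemany(
        "INSERT INTO staff (id, full_name, department) VALUES (?, ?, ?)", rows
    )
    target.commit()


def run_etl(source, target):
    """Run the three stages in order and return the number of rows loaded."""
    rows = transform(extract(source))
    load(target, rows)
    return len(rows)


def demo():
    """Build two in-memory databases, run the ETL, and return the target rows."""
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE employee (id INTEGER, name TEXT, dept TEXT)")
    source.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [(1, "  alice smith ", "Sales"), (2, None, "HR"), (3, "bob JONES", None)],
    )
    target = sqlite3.connect(":memory:")
    target.execute(
        "CREATE TABLE staff (id INTEGER PRIMARY KEY, full_name TEXT, department TEXT)"
    )
    run_etl(source, target)
    return target.execute(
        "SELECT id, full_name, department FROM staff ORDER BY id"
    ).fetchall()
```

Calling demo() returns the two rows that survive cleaning; the row with no name is dropped during the transform stage.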
The usual quality concerns when implementing ETL are correctness, integrity, consistency, completeness, validity, timeliness, and accessibility. Whatever tool is used, an ETL implementation passes the quality bar only if it satisfies these criteria.
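A few of these criteria can be checked mechanically after a load. The sketch below is one possible reading, not a standard definition: the function name check_load and the mapping of each criterion to a specific check (key coverage for completeness, unique keys for consistency, no missing values for integrity) are assumptions made for illustration.

```python
def check_load(source_rows, target_rows, key=0):
    """Compare source and target rows and return a dict of check name -> bool.

    Rows are tuples; `key` is the index of the column that identifies a row.
    """
    source_keys = [row[key] for row in source_rows]
    target_keys = [row[key] for row in target_rows]
    return {
        # completeness (assumed meaning): every source key arrived in the target
        "completeness": set(source_keys) <= set(target_keys),
        # consistency (assumed meaning): no key appears twice in the target
        "consistency": len(target_keys) == len(set(target_keys)),
        # integrity (assumed meaning): no loaded row carries a missing value
        "integrity": all(
            value is not None for row in target_rows for value in row
        ),
    }
```

For example, check_load([(1, "a"), (2, "b")], [(1, "a")]) reports completeness as False because the row with key 2 never reached the target. Criteria such as timeliness and accessibility depend on scheduling and infrastructure and are not covered by a check like this.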
Kettle is one such tool; others include Informatica, DataStage, OWB (Oracle Warehouse Builder), and Microsoft's DTS. What follows is a brief introduction to Kettle.
Kettle is an open-source ETL tool written in Java. It runs on Windows, Linux, and Unix, and its data extraction is efficient and stable. The name means just what it says, a kettle: the idea behind the tool is to pour all kinds of data into one pot, process it in various ways, and have it flow out in a specified format.
A Kettle design involves several concepts: the repository, database connections, jobs, transformations (trans), and steps. An analogy helps: the repository corresponds to a Java project, a database connection corresponds to the database connection configured in that project, a job corresponds to one line of execution through the project, a transformation corresponds to a Java class, and a step corresponds to a method in that class. So working with Kettle means creating a repository, defining the database connections, building transformations, writing each step inside a transformation, and then linking the transformations together into a job (a transformation can also be run on its own).
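The job, transformation, and step hierarchy described above can be pictured with a small Python model. This is only a sketch of the analogy, not Kettle's API: the classes Transformation and Job, the step functions, and the idea of passing rows directly from one transformation to the next are simplifying assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]
Step = Callable[[List[Row]], List[Row]]  # a step turns rows into rows


@dataclass
class Transformation:
    """An ordered chain of steps, like a class made up of methods."""
    name: str
    steps: List[Step] = field(default_factory=list)

    def run(self, rows: List[Row]) -> List[Row]:
        for step in self.steps:
            rows = step(rows)
        return rows


@dataclass
class Job:
    """An ordered chain of transformations, run one after another."""
    name: str
    transformations: List[Transformation] = field(default_factory=list)

    def run(self, rows: List[Row]) -> List[Row]:
        for trans in self.transformations:
            rows = trans.run(rows)
        return rows


# Hypothetical steps, each a plain function from rows to rows.
def drop_inactive(rows: List[Row]) -> List[Row]:
    return [row for row in rows if row.get("active")]


def upper_name(rows: List[Row]) -> List[Row]:
    return [{**row, "name": str(row["name"]).upper()} for row in rows]


def keep_name_only(rows: List[Row]) -> List[Row]:
    return [{"name": row["name"]} for row in rows]


clean = Transformation("clean", [drop_inactive, upper_name])
project = Transformation("project", [keep_name_only])
job = Job("migrate_employees", [clean, project])
```

Running job.run(rows) executes both transformations in order, while clean.run(rows) runs a single transformation by itself, mirroring the point that a transformation can also be executed independently of any job.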
Finally, when is Kettle a good fit? Suppose project A has to be rolled out at many enterprises, and at each one it needs base data from that enterprise's existing database, such as employees, organizational structure, customers, and suppliers. Kettle handles this kind of task easily. In short, Kettle is a good fit when a project has to migrate data between a large number of databases.
Excerpt from: http://blog.csdn.net/liujiahan629629/article/details/47061727
2016/11/10 Kettle Overview