Kettle is an open-source ETL tool written in Java. It runs on Windows, Linux, and Unix. It is portable ("green") software that requires no installation, and its data extraction is efficient and stable.
Kettle's Chinese name means a kettle, or pot. Matt, the project's lead programmer, wanted to pour all kinds of data into one pot and have it flow out in a specified format.
Kettle is an ETL tool set that lets you manage data from different databases. It provides a graphical environment in which you describe what you want to do, rather than how you want to do it.
Kettle has two types of script files: Transformation and Job. A Transformation performs the basic data conversion, and a Job controls the overall workflow.
Kettle can be downloaded at http://kettle.pentaho.org.
Note: ETL is short for extract-transform-load. It describes the process of extracting data from a source, transforming it, and loading it into a destination. ETL is most commonly used in data warehousing, but it is not limited to data warehouses.
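The three ETL stages can be illustrated with a minimal sketch in plain Python. This is not Kettle code and does not use any Kettle API; the sample data, the `sales` table, and the function names `extract`, `transform`, `load`, and `run_etl` are all assumptions made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical sample source data (not from the tutorial).
SOURCE_CSV = "id,name,amount\n1,alice,10.5\n2,bob,3\n"


def extract(text):
    """Extract: read raw rows from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: convert string fields to proper types and normalize names."""
    return [
        (int(row["id"]), row["name"].strip().title(), float(row["amount"]))
        for row in rows
    ]


def load(records, conn):
    """Load: write the transformed records into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(text, conn):
    """Run the three stages in order and return what ended up in the destination."""
    load(transform(extract(text)), conn)
    return conn.execute("SELECT id, name, amount FROM sales ORDER BY id").fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")  # in-memory destination database
    for record in run_etl(SOURCE_CSV, connection):
        print(record)
```

In Kettle, the same stages would typically be drawn graphically as steps inside a Transformation instead of being written by hand like this.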
Kettle 5.x User Guide download URL: http://download.csdn.net/detail/fan_hai_ping/8030177
The software versions used in this tutorial are as follows:
1) Hadoop (1.2.1)
2) Pentaho Data Integration (5.2.0)
3) HBase (0.94.19)
Note: If you encounter any problems while reading or using this tutorial, please join us!
Kettle 5.x User Guide