ETL Tool Pentaho Kettle's transformation and job integration
1. Kettle
1.1. Introduction
Kettle is an open-source ETL tool written in pure Java. It extracts data efficiently and stably, and is often used as a data migration tool. Kettle has two types of script files: transformations and jobs. A transformation performs the basic data conversion, and a job controls the entire workflow.
2. Integrated Development
2.1. Transformation implementation parsing
// Initialize the Kettle envir
<slave_config>
  <slaveserver>
    <name>slave1-8081</name>
    <hostname>localhost</hostname>
    <port>8081</port>
    <username>cluster</username>
    <password>cluster</password>
    <master>N</master>
  </slaveserver>
</slave_config>
The server we started is a slave server, so look at the username and password configured inside slaveserver. The default for both is cluster; these are the values for your login account and password. You can now log in to the configured Carte server.
After logging in you will find nothing listed. This is normal, because we also need to configure the kettle job and transformati
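The slave server settings quoted above (name slave1-8081, host localhost, port 8081, username and password cluster) can also be read programmatically, for example to check which account a Carte instance expects. The following is a minimal sketch that uses only the JDK's XML parser. The class name SlaveConfigSketch and the inline sample string are hypothetical, and the lowercase element names are an assumption based on the tags visible in the text.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Illustrative sketch: reads individual settings out of a slave server configuration.
class SlaveConfigSketch {

    // Sample configuration built from the values quoted in the text (assumed layout).
    static final String SAMPLE =
        "<slave_config><slaveserver>"
        + "<name>slave1-8081</name><hostname>localhost</hostname><port>8081</port>"
        + "<username>cluster</username><password>cluster</password><master>N</master>"
        + "</slaveserver></slave_config>";

    // Returns the text of the first <tag> element found under <slaveserver>.
    static String read(String xml, String tag) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element server = (Element) doc.getElementsByTagName("slaveserver").item(0);
            return server.getElementsByTagName(tag).item(0).getTextContent().trim();
        } catch (Exception e) {
            // Wrap parser errors so callers do not need to handle checked exceptions.
            throw new IllegalArgumentException("Could not read <" + tag + ">", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("user=" + read(SAMPLE, "username") + ", port=" + read(SAMPLE, "port"));
    }
}
```

In practice the XML would come from the configuration file passed to Carte rather than from an inline string.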
1. Create a new dashboard
After logging on to Pentaho, click File -> New -> New Dashboard to create a new dashboard.
2. New dashboard
The new dashboard is shown in the following figure. It consists of the layout structure, the component panel, and the dashboard panel.
3. Layout structure
The layout structure is used to manage the layout of the entire dashboard. Generally, a dashboard is composed of tables, which are divided into rows and columns. You can click an element to edit it
1. Download the JDBC driver for SQL Server first. See the following link:
[1] http://msdn.microsoft.com/en-us/data/aa937724.aspx
[2] Searching Google for "SQL Server JDBC" also works.
[3] Sqljdbc4.jar is the jar package we need.
2. Download Mondrian, Pentaho's multidimensional data server.
[1] http://sourceforge.net/ → search for Mondrian and download it. (As of press time, the latest version is Mondrian 3.5.0.) The backup address is as follows:
Using Metadata generated by Pentaho Metadata Editor (PME) as a data source
Pentaho Report Designer (PRD) supports a variety of data source input methods. Since Pentaho Metadata Editor is a member of the same platform, using its metadata as a data source should be a cinch, and it is.
Given the actual situation, we go straight to an example that uses parameters.
1. Similarly, create a new parameter
Pentaho BI Server Community Edition 6.1 includes a Kettle component that can run Kettle scripts. However, since Kettle does not publish directly to the BISERVER-CE service, the ETL scripts (.ktr and .kjb) developed locally (in a Windows environment) through the graphical interface must be uploaded to the repository managed by BISERVER-CE before BISERVER-CE can run and schedule them.
Focus: Kettle Repository and BISERVER-CE resource pool establish a c
Website link: http://wiki.pentaho.com/display/EAI/Pan+User+Documentation
Pan
Pan is a program that executes transformations edited with Spoon. After the PDI software .zip is decompressed, pan.bat is available.
Using Pan on the command line to execute a transformation
The official website mainly covers the commands for the Linux platform; here I mainly cover the commands for the Windows platform.
Options
Format: /option:"Value"
Parameters
Format: "-param:name=value"
Repository Warehouse Sel
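The two formats above, /option:"Value" for options and "-param:name=value" for parameters, can be assembled mechanically. The following is a minimal sketch that only builds the command line as a string for display; it does not launch Pan. The class name PanCommandSketch, the file path, and the parameter name day are hypothetical, and file and level are used here as example option names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch: formats a Windows-style Pan command line from options and parameters.
class PanCommandSketch {

    // Builds: <script> /option:"Value" ... "-param:name=value" ...
    static String build(String script, Map<String, String> options, Map<String, String> params) {
        StringBuilder cmd = new StringBuilder(script);
        for (Map.Entry<String, String> o : options.entrySet()) {
            cmd.append(" /").append(o.getKey()).append(":\"").append(o.getValue()).append('"');
        }
        for (Map.Entry<String, String> p : params.entrySet()) {
            cmd.append(" \"-param:").append(p.getKey()).append('=').append(p.getValue()).append('"');
        }
        return cmd.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the options in insertion order.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("file", "C:\\etl\\demo.ktr");   // hypothetical transformation file
        options.put("level", "Basic");
        Map<String, String> params = new LinkedHashMap<>();
        params.put("day", "2016-01-01");             // hypothetical named parameter
        System.out.println(build("pan.bat", options, params));
    }
}
```

Keeping the formatting in one method makes it easy to log the exact command before running it from a scheduler or a batch file.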
multiple). Properties: locale (the country and language code, such as en_US or zh_CN); value: the corresponding text.
(5) localized_tooltip/tooltip (the plugin hint text; there can be multiple). Properties: locale (the country and language code, such as en_US or zh_CN); value: the corresponding text.
C. Second way: scan all of the jar packages in these three directories for classes declared with the corresponding type (this method requires the definition file) type of interface
applications. All you need is a way for Java applications to talk to different databases. JDBC is the mechanism for this purpose.
JDBC extends what Java can do. For example, you can use Java and the JDBC API to publish a webpage containing an applet, and the information used by the applet may come from a remote database. Enterprises can also use JDBC to connect all employees to one or more internal databases over the intranet (even if these employees use different operating systems
This example is simple; the difficulty lies in installing the Hadoop2.20 plugin (see my previous blog post). The implementation steps are as follows:
1. Create a job
Create a kettle job to achieve the following effects.
2. Configure Hadoop to copy files
Configure the Hadoop Copy files component to achieve the following effects:
3. Testing
Click the Run button to get the result shown below, which indicates that your configuration was
Website link: http://wiki.pentaho.com/display/EAI/Call+DB+Procedure
Description
The Call DB Procedure step allows the user to execute a database stored procedure and obtain the results. Stored procedures or methods can only return
1. It can have only one primary query result set.
2. Place the ${parameter name} parameter in the SQL statement at the bottom of the Data page in the upper-right corner. The parameter must have a default value. Otherwise, no data is displayed in
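To illustrate why the default value matters, here is a minimal sketch of ${name} substitution with a fallback to defaults. It is not the report designer's own implementation; the class name SqlParamSketch, the table, and the parameter name region are hypothetical. Plain text substitution like this is for illustration only and should not be used with untrusted input.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: resolves ${name} placeholders in a SQL string.
class SqlParamSketch {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    // Uses the supplied value when present, otherwise the default; fails if neither exists.
    static String resolve(String sql, Map<String, String> supplied, Map<String, String> defaults) {
        Matcher m = PLACEHOLDER.matcher(sql);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String value = supplied.containsKey(name) ? supplied.get(name) : defaults.get(name);
            if (value == null) {
                throw new IllegalArgumentException("No value or default for parameter: " + name);
            }
            // quoteReplacement keeps '$' and '\' in the value from being treated as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new HashMap<>();
        defaults.put("region", "ALL");               // hypothetical default value
        Map<String, String> supplied = new HashMap<>();
        String sql = "SELECT * FROM orders WHERE region = '${region}'";
        System.out.println(resolve(sql, supplied, defaults)); // falls back to the default
        supplied.put("region", "EAST");
        System.out.println(resolve(sql, supplied, defaults)); // uses the supplied value
    }
}
```

Without a default, the first call has nothing to substitute and the query cannot be built, which mirrors the empty preview described above.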
I. Extracting data from HDFS to an RDBMS
1. Download the sample file from the address below.
http://wiki.pentaho.com/download/attachments/23530622/weblogs_aggregate.txt.zip?version=1&modificationDate=1327067858000
2. Use the following command to place
Laying out a dashboard is quite troublesome, as CSS layout usually is. It mainly involves the layout itself and the size settings of the components.
The layout consists of row objects and column objects, as well as image and HTML objects.
A row object
http://www.aboutyun.com/thread-7450-1-1.html
There is a very large table, Trlog. The table is about 2 TB.
Trlog:
CREATE TABLE Trlog
(PLATFORM string, user_id int, Click_time string, Click_url string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t';
I had never encountered any problems installing VS6 before. During a recent installation, an acmboot.exe error kept occurring. I suspect it was a compatibility problem caused by my XP system version.
I found the solution on Google:
Copy vs98ent.stf under setup in the installation directory and name the copy acmsetup.stf.
Copy all files in setup to the install
Bower installation and use, and Git installation
Bower requires: Node and Git
1. Install Git (select the second option: Use Git from the Windows Command Prompt)
2. Install Node
3. Once configuration is complete, open cmd and enter node -v (v stands for version) to check the version number and verify that the installation succeeded
4. Install Bower: npm install -g bower
Step 1: preparation; step 2: install MySQL; step 3: install PHP; step 4: install Nginx; step 5: install Memcached and the PHP extensions.
After first setting up a server, I personally prefer to reset the server name.
To view the hostname on CentOS, use the command:
hostname
-------------------
One: the preparato
Flask relies on two external libraries: Werkzeug and Jinja2. Werkzeug is a toolset for WSGI (the standard Python interface between web applications and servers, used in both development and deployment), and Jinja2 is responsible for rendering templates.
First, installation
Prerequisites for installing Flask
1. Python 2.x is installed
2. easy_install is installed
Before installing flask, you have to ins