Because both of them are used: Informatica is easy to manage going forward, especially for data correction; when data is supplemented at a later stage, the data flow is clear at a glance. SQL is efficient, but it is inconvenient to maintain later, and it takes a long time to trace a data flow. ETL tools are easier to manage and maintain, especially for complicated cleansing processes.
Data from the master data source is loaded into database tables for each subsidiary; the subsidiaries then only need to connect to their own database tables when developing reports. This controls data access rights and also keeps each subsidiary's data in that subsidiary's own database tables. III. Project Construction Plan: 1. Introduction to the tool used, Kettle. Kettle is a foreign open-source ETL tool.
1. Alibaba open-source software: DataX
DataX is an offline synchronization tool for heterogeneous data sources, dedicated to achieving stable and efficient data synchronization between heterogeneous data sources including relational databases (MySQL, Oracle, etc.), HDFS, Hive, ODPS, HBase, FTP, and more. (Excerpt from Wikipedia)
2. Apache open-source software: Sqoop
Sqoop (pronounced "skup") is an open-source tool used primarily to transfer data between Hadoop (Hive) and traditional relational databases such as MySQL and PostgreSQL.
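As a minimal sketch of how Sqoop is typically driven, the snippet below invokes a table import through Sqoop's programmatic Java entry point. The connection URL, credentials, table name, and target directory are placeholder assumptions, and the exact class path (org.apache.sqoop.Sqoop) may differ between Sqoop 1.x releases.

import org.apache.sqoop.Sqoop;

/**
 * Sketch: import one MySQL table into HDFS via Sqoop's programmatic entry
 * point. Host, credentials, table and target directory are hypothetical.
 */
public class SqoopImportSketch {
    public static void main(String[] args) {
        String[] sqoopArgs = {
            "import",
            "--connect", "jdbc:mysql://db-host:3306/sales",
            "--username", "etl_user",
            "--password", "secret",
            "--table", "orders",
            "--target-dir", "/warehouse/staging/orders",
            "-m", "1"                       // single mapper, no split key needed
        };
        // runTool parses the arguments exactly like the sqoop CLI would
        int exitCode = Sqoop.runTool(sqoopArgs);
        System.exit(exitCode);
    }
}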
I see you share a lot of Hadoop-related content, so let me introduce an ETL tool: Kettle. Kettle is an open-source ETL tool from Pentaho and, like Hadoop, is implemented in Java. Its purpose is to do the Extract, Transform, and Load work of data integration. Kettle has two kinds of script files: transformations and jobs.
Over the years I have worked almost entirely with ETL and have been exposed to a variety of ETL tools. I have organized these tools here to share with you.
I. ETL Tools
Foreign
1. DataStage
Review: the most professional ETL tool.
2. Hold down the SHIFT key and drag from the "Table input" icon to "Table output" in "Transformation 1" to establish a connection; note that the arrow is in the opposite direction. 3. Double-click "Table input" to configure it: set up the database connection, test it, and configure the SQL statement that queries the specified table. You can then see the records in the table with "Preview". (Figures 4-6 in the original post show the Table input configuration, the test result, and the SQL statement.)
Using the ETL tool Kettle to extract data from one database into another:
1. Open the ETL folder and double-click Spoon.bat to start Kettle.
2. When the repository selection dialog appears, click Cancel if you are not using a repository.
3. Select Close.
4. Create a new transformation.
5. Configure the required database connections.
6. Add a "Table input" step for the table that needs to be extracted (a programmatic way to run such a transformation is sketched below).
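The steps above are performed in the Spoon GUI. As a minimal sketch, and assuming the standard Pentaho Data Integration Java API (org.pentaho.di.*) is on the classpath, a transformation saved from Spoon can also be executed from Java roughly as follows; the file name extract_to_target.ktr is a hypothetical placeholder.

import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

/**
 * Sketch: run a transformation (.ktr) designed in Spoon, e.g. one containing
 * a "Table input" step feeding a "Table output" step. Path is hypothetical.
 */
public class RunKettleTransformation {
    public static void main(String[] args) throws Exception {
        KettleEnvironment.init();                       // initialize the Kettle runtime
        TransMeta meta = new TransMeta("extract_to_target.ktr");
        Trans trans = new Trans(meta);
        trans.execute(null);                            // no command-line arguments
        trans.waitUntilFinished();                      // block until all steps finish
        if (trans.getErrors() > 0) {
            throw new RuntimeException("Transformation finished with errors");
        }
    }
}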
Consider Services configured with a preferred instance and a standby (available) instance. Once a single point of failure occurs on the preferred instance, the service automatically fails over to the standby instance. Suppose the current RAC database is defined with 3 nodes, srv1, srv2, and srv3, and that two different services, sales.2gotrade.com and settlement.2gotrade.com, are running in the current database. The Sales department establishes its connections through the sales.2gotrade.com service name, and the Settlement department establishes its connections through the settlement.2gotrade.com service name.
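To make the service-name idea concrete, here is a hedged sketch of how the Sales department's application might connect through its service rather than to a specific instance, using the standard Oracle thin JDBC URL format with a service name; the host, port, and credentials are placeholder assumptions.

import java.sql.Connection;
import java.sql.DriverManager;

/**
 * Sketch: connect through the sales.2gotrade.com service name. Because the
 * connection targets the service, not a specific instance, failover of the
 * service to the standby instance is transparent to new connections.
 * Host, port and credentials are hypothetical placeholders.
 */
public class SalesServiceConnection {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:oracle:thin:@//rac-scan.2gotrade.com:1521/sales.2gotrade.com";
        try (Connection conn = DriverManager.getConnection(url, "sales_app", "secret")) {
            System.out.println("Connected via service: " + !conn.isClosed());
        }
    }
}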
I. Purpose: merge tables on different servers onto another server. For example, merge table A on server 1 and table B on server 2 into table C on server 3. Requirements: table A needs to be trimmed (unnecessary fields removed), and table B needs some fields added. II. Method of use: (1) Create a new table C (with fields that conform to the actual system design) in the database on server 3. (2) Create a new Table input step, connect to server 1, and select the table you want to use by getting the SQL statement (a plain JDBC sketch of the same merge is given below).
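As a minimal sketch of the same merge outside of Kettle, and assuming plain JDBC access to all three servers (the connection URLs, credentials, and column names below are hypothetical), the logic looks roughly like this:

import java.sql.*;

/**
 * Sketch: read selected fields of table A (server 1) and table B (server 2)
 * and insert them into table C (server 3). All URLs, credentials and column
 * names are hypothetical placeholders.
 */
public class MergeTablesSketch {
    public static void main(String[] args) throws SQLException {
        try (Connection src1 = DriverManager.getConnection("jdbc:mysql://server1/db", "user", "pw");
             Connection src2 = DriverManager.getConnection("jdbc:mysql://server2/db", "user", "pw");
             Connection dst  = DriverManager.getConnection("jdbc:mysql://server3/db", "user", "pw");
             PreparedStatement insert = dst.prepareStatement(
                     "INSERT INTO table_c (id, name, source) VALUES (?, ?, ?)")) {

            copy(src1, "SELECT id, name FROM table_a", insert, "A");  // table A: only the needed fields
            copy(src2, "SELECT id, name FROM table_b", insert, "B");  // table B: gains an extra field
        }
    }

    private static void copy(Connection src, String query,
                             PreparedStatement insert, String source) throws SQLException {
        try (Statement st = src.createStatement(); ResultSet rs = st.executeQuery(query)) {
            while (rs.next()) {
                insert.setLong(1, rs.getLong("id"));
                insert.setString(2, rs.getString("name"));
                insert.setString(3, source);              // field added during the merge
                insert.executeUpdate();
            }
        }
    }
}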
The value mapping here is a bit like Oracle's CASE WHEN feature. For example, a field a has the value 1, but I now want a=1 to become "male"; that is, 1 is mapped to "male". That is value mapping. So how do we do it? Kettle has a "Value mapper" component. The following is a brief introduction to how to use it: first enter "value mapping" in the search box on the left of the program, find the Value mapper component, and then drag it into the transformation.
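As a hedged sketch of what such a value-mapping step does to each row (the second mapping pair and the default value are illustrative assumptions, not taken from the original post), the equivalent logic in plain Java is just a lookup table with a default:

import java.util.HashMap;
import java.util.Map;

/**
 * Sketch of a value-mapping rule: map source values of field "a"
 * (1 -> "male", 2 -> "female") to target values, with a default for
 * anything unmapped. The 2 -> "female" pair is an assumption.
 */
public class ValueMappingSketch {
    private static final Map<Integer, String> MAPPING = new HashMap<>();
    static {
        MAPPING.put(1, "male");
        MAPPING.put(2, "female");
    }

    static String mapValue(int a) {
        return MAPPING.getOrDefault(a, "unknown");   // default for unmapped inputs
    }

    public static void main(String[] args) {
        System.out.println(mapValue(1));   // male
        System.out.println(mapValue(9));   // unknown
    }
}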
ETL is responsible for extracting distributed, heterogeneous data sources such as relational data and flat data files into a temporary middle layer, then cleaning, transforming, and integrating the data, and finally loading it into a data warehouse or data mart, where it becomes the basis for online analytical processing and data mining.
If data conversion is infrequent, or the requirements are not high, it can be implemented manually.
Technical support: Talend: mainly in the United States. Kettle: technical support personnel can be found in the United States, Europe (Belgium, Germany, France, UK), and Asia (China, Japan, and South Korea). Informatica: all over the world. Inaplex Inaport: mainly in the UK.
Deployment: Talend: creates a Java or Perl file and runs it using the operating system's scheduling tool. Kettle: you can use a job or operating-system scheduling to execute a transformation file or job file, or deploy it on multiple machines.
Data quality: Talend: SQL statements can be written manually. Kettle: data quality features are in the GUI; you can manually write SQL statements, Java scripts, or regular expressions to complete the data cleansing (an illustrative cleansing rule is sketched after this comparison). Informatica: has a dedicated product, Informatica Data Quality, to ensure quality. Inaplex Inaport: data cleansing is easier because only specific data is processed.
Monitoring: Talend: has monitoring and logging tools. Kettle: has monitoring and logging tools. Informatica: very detailed monitoring and logging.
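To illustrate the kind of cleansing rule mentioned under data quality above (writing Java or regular expressions), here is a minimal, hedged sketch; the field semantics and the normalization rules are assumptions for illustration only.

import java.util.regex.Pattern;

/**
 * Sketch of a regex-based cleansing rule: trim whitespace, collapse repeated
 * internal spaces, and reject values that are not plain digits (e.g. for a
 * numeric code field). The field semantics are an illustrative assumption.
 */
public class CleansingRuleSketch {
    private static final Pattern DIGITS_ONLY = Pattern.compile("\\d+");

    static String cleanse(String raw) {
        if (raw == null) {
            return null;
        }
        String value = raw.trim().replaceAll("\\s+", " ");               // normalize whitespace
        return DIGITS_ONLY.matcher(value).matches() ? value : null;      // null means "failed the rule"
    }

    public static void main(String[] args) {
        System.out.println(cleanse("  12345 "));   // 12345
        System.out.println(cleanse("12a45"));      // null
    }
}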
In the past, we wrapped the low-level C API of each database to implement data import and export between several heterogeneous databases, but the code was complex and inconvenient to open-source.
In the afternoon I wrote a simple data extraction program in Java to port a MySQL database to Sybase ASE. I made it open source and put it at http://code.google.com/p/jmyetl/. I originally named it myetl, but someone had already applied for that name.