Keywords: data integration, ETL, data integration solution
Data integration, also called ETL (extract, transform, load), extracts, merges, cleans, and standardizes business data across departments. The processed data can be stored in a master data repository to give every business system a consistent view of the data (master data management). It can also be stored in a big data platform, organized in a theme (subject-oriented) data format, for data analysis and mining (data warehouse, big data). Common problems of data integration are as follows:
1 Many departments are involved, and the data interfaces are of many different types.
2 Because top-level design is missing, or for historical reasons, there are no unified data standards, which leads to information silos and severe data fragmentation.
3 Problems such as data redundancy, inconsistent data, and erroneous data lower data quality and make the data hard to retrieve and use effectively.
4 Business data can change at any time, and those changes must be merged efficiently into the master data store or the data warehouse.
Implementing a data integration project with the software has the following advantages:
1 The software supports read and write access to a wide range of data interfaces: mainstream databases (Oracle, DB2, SQL Server, MySQL, PostgreSQL, Informix, MongoDB, Redis, Teradata, SAP HANA, etc.), external files (text, XML, Excel), big data storage (Hive, HBase), and message servers (Kafka).
2 The software provides a data federation function that merges business data across databases. It supports various mapping conversions, such as type conversion, field operations, reference conversion, string processing, character set conversion, null value handling, date conversion, aggregation, predefined values, field splitting, and field merging.
3 The software supports rule-based data cleaning, filtering, and conversion. A simple, intuitive graphical interface helps users standardize data efficiently.
4 The software provides incremental extraction methods such as timestamps, triggers, and log analysis, and supports cleaning and conversion of the incremental data. The processed data can be stored in a database or in big data storage, or sent to a Kafka message server.
5 The software provides a workflow scheduling function for scheduling and managing the execution order, trigger conditions, and exception-handling logic of related tasks.