Data integration is currently a hot topic, and related products and platforms keep multiplying. Many CIOs hesitate when choosing among them. A thorough understanding of the architecture of a data integration platform, and of the functions that each vendor's products provide, gives a sound basis for decisions about the data platform.
I had the honor of taking part in the design of an integration platform for a well-known enterprise in China, where I led the requirements analysis and product selection for the data integration platform. In the course of that work I studied many new technical directions and products. Below I focus on an emerging product category in the data integration field: Master Data Management (MDM).
The concept of master data
First, what is master data? Here I borrow a data classification model from another website. It distinguishes six categories: metadata, reference data, master data, enterprise structure data, transaction activity data, and transaction audit data.
Here is a brief explanation of the six categories; fuller definitions are easy to find on the Internet.
Metadata: data about data. When designing a table, most attribute fields are metadata, for example gender, nationality, and province of birth. This is the data closest to natural meaning.
Reference data: the set of values that a metadata field may take. In table design, the data dictionary usually holds the reference data. For example, gender can only be male or female, so male and female are reference data; the reference data for nationality is the list of more than 100 countries and regions in the world.
Master data: the most important entities in our database design, made up of instances of metadata and reference data. DM Review columnist Jane Griffin defines master data as "... it is used to create and maintain a full-enterprise 'record system' for core business entities to record business transactions and assess the information required for the performance of these entities." The customer information and product information we often encounter usually belong to master data. Master data is described in more detail later.
Enterprise structure data: the data entities that an enterprise's business requires. An entity may be a collection of several pieces of master data. Enterprise structure data differs greatly from one industry to another.
Transaction activity data: data generated by activity between master data entities. For example, a record of a customer purchasing a product is transaction activity data, and so is the production record created when a factory manufactures a product.
Transaction audit data: a record of every activity performed on the data, such as modifying customer information or adding and deleting transactions. Many critical systems, such as those in banks, must record these activities to comply with regulations such as Basel II and the Sarbanes-Oxley Act.
In the data model diagram, the deeper the blue, the stronger the semantic content and the more important the data quality; the deeper the yellow, the larger the data volume, the higher the update frequency, the more the data is captured in real time, and the shorter its life. As you can see, metadata has the strongest semantics, is almost never updated, has the smallest volume, and has the longest life cycle.
Http://www.dmreview.com/issues/20060401/1051002-1.html
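To make the categories above more concrete, here is a minimal Python sketch of how several of them might relate to each other. All class names, field names, and sample values (Customer, Purchase, AuditEntry, the country codes) are assumptions chosen for illustration; they do not come from the classification model itself.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

# Metadata: the attribute definitions (field names and types) of an entity.
CUSTOMER_METADATA = {"name": str, "gender": str, "nationality": str}

# Reference data: the allowed values for a metadata field (illustrative subset).
REFERENCE_DATA = {
    "gender": {"male", "female"},
    "nationality": {"CN", "US", "DE"},
}


@dataclass
class Customer:
    """Master data: an instance built from metadata fields and reference values."""
    customer_id: str
    name: str
    gender: str
    nationality: str


@dataclass
class Purchase:
    """Transaction activity data: an event between master data entities."""
    customer_id: str
    product_id: str
    quantity: int


@dataclass
class AuditEntry:
    """Transaction audit data: a record of what was done to which entity, and when."""
    entity_id: str
    action: str
    timestamp: datetime


def invalid_fields(customer: Customer) -> List[str]:
    """Return the fields whose values fall outside the reference data."""
    return [
        field
        for field, allowed in REFERENCE_DATA.items()
        if getattr(customer, field) not in allowed
    ]
```

Calling invalid_fields on a Customer whose gender is not in the reference set returns the name of the offending field, which is the kind of check a data dictionary makes possible.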
Master data is the most basic business data in an enterprise application system. The original English description reads: master data is the fundamental business data in the company, typically long-lived and used across multiple applications.
Core master data are operational entities, supporting all fundamental business activity transactions being executed on this level. The core master data are common and retriable within the Organization.
I think this is easy to understand. For example, the basic data that a production system handles is product data, the basic data of an HR system is employees, and the basic data of a CRM system is customers.
Generally, core master data includes customers, contracts, suppliers, distributors/partners, and employees.
In addition, requirements and expectations for master data management differ greatly between industries, so industry experience is also very important.
The concept of master data management
From the introduction above we can see that master data is not a new concept, so why were there no master data management products before? The explanation is much the same as the reason data integration arose. Master data is attached to individual business systems such as HR, ERP, SCM, the enterprise website, and business partners' systems, so one piece of master data, a product for example, may be stored in several of them. Problems follow: inconsistent data encoding between systems, redundant data, and incomplete data in some systems. If we build a BI system, for example, we may have to pull data from several internal systems and from partners' systems to obtain one complete set of master data. Clearly a solution is needed that provides a single access interface to master data, which makes master data access more efficient and supplies reliable data for marketing, sales, customer relationship management, and other activities, improving the agility of the enterprise.
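The idea of a single access interface over inconsistently coded systems can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three source systems, their local product codes, the cross-reference table, and the priority order are all hypothetical, and a real MDM product would handle matching and conflict resolution far more carefully.

```python
# Hypothetical local records for the same product held by three systems.
ERP_PRODUCTS = {"P-1001": {"name": "Widget", "unit": "pcs"}}
SCM_PRODUCTS = {"1001-W": {"name": "Widget", "supplier": "Acme"}}
WEB_PRODUCTS = {"widget-01": {"name": "widget", "price": 9.5}}

SOURCES = {"erp": ERP_PRODUCTS, "scm": SCM_PRODUCTS, "web": WEB_PRODUCTS}

# Cross-reference table: master ID -> local code used by each source system.
XREF = {"M-0001": {"erp": "P-1001", "scm": "1001-W", "web": "widget-01"}}

# Which system wins when two disagree on an attribute (an assumption).
PRIORITY = ["erp", "scm", "web"]


def get_master_record(master_id: str) -> dict:
    """Assemble one consolidated record from every system that knows the entity."""
    record = {"master_id": master_id}
    # Walk from lowest to highest priority so trusted sources overwrite others.
    for system in reversed(PRIORITY):
        local_code = XREF[master_id].get(system)
        local = SOURCES[system].get(local_code)
        if local is not None:
            record.update(local)
    return record
```

A caller such as a BI application asks for "M-0001" and receives one record that combines the unit from ERP, the supplier from SCM, and the price from the website, without needing to know any of the local codes.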
Master data management is an advanced form of data management. It must be built on ETL, enterprise information integration (EII), and related technologies, so many master data management platforms include functions such as data extraction, data loading, data transformation, data quality management, data replication, and data synchronization. Some vendors also deliver MDM to customers as a module of their data integration products.
Accessing master data without master data management
Scattering master data across systems causes several problems: redundant data in the various systems makes data access tedious, coding is inconsistent, and data is synchronized unreliably, so consistency is lost. The impact on the business includes delayed time to market, products in short supply, inaccurate order delivery, low sales efficiency, reduced customer satisfaction, and reduced productivity.
Benefits of a master data management platform:
There is a single platform for accessing master data; the enterprise gains a consistent and complete information sharing platform and a centralized, content-rich, clean data center; and data applications, business processes, and decision-making systems get a trustworthy channel for accessing data. Personally I feel that BI-related applications benefit the most once an MDM platform is in place.
Functional modules of a master data management platform
Master repositories: X-Ref DB, Masters DB, and master data applications.
Data quality (data quality assurance): checks the quality of source data. All data transmitted from a source system to the staging area should be loaded only after passing the quality check. Source data checks should confirm that the interface data file format follows the standard, and confirm the file size, the number of records, the file generation time, and so on. ETL quality checks include verifying primary and foreign key relationships and encoding rules. Every extraction, transformation, and load must have a complete log record, and the number of records must be confirmed to be consistent after loading.
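The quality checks described above (file format, record count, key relationships, logging, and reconciliation after loading) might look like the following Python sketch. The CSV layout, the column names, and the function names check_source_file and load are assumptions made for illustration, not part of any particular MDM product.

```python
import csv
import io
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl")

# Assumed interface file layout for an order extract.
EXPECTED_HEADER = ["order_id", "customer_id", "amount"]


def check_source_file(text: str, expected_rows: int, known_customers: set) -> list:
    """Run pre-load checks on a CSV extract and return a list of problems."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows or rows[0] != EXPECTED_HEADER:
        return ["unexpected file format (header mismatch)"]
    problems = []
    data = rows[1:]
    # Record count check against the count announced by the source system.
    if len(data) != expected_rows:
        problems.append(f"expected {expected_rows} records, found {len(data)}")
    for row in data:
        if len(row) != len(EXPECTED_HEADER):
            problems.append(f"malformed row: {row}")
            continue
        order_id, customer_id, _amount = row
        # Foreign key check: every order must reference a known customer.
        if customer_id not in known_customers:
            problems.append(f"order {order_id}: unknown customer {customer_id}")
    return problems


def load(text: str, expected_rows: int, known_customers: set, target: list) -> bool:
    """Load the extract into target only if all checks pass, and log the outcome."""
    problems = check_source_file(text, expected_rows, known_customers)
    if problems:
        log.error("load rejected: %s", "; ".join(problems))
        return False
    rows = list(csv.DictReader(io.StringIO(text)))
    target.extend(rows)
    log.info("loaded %d of %d expected records", len(rows), expected_rows)
    # Reconcile the loaded count with the expected count after loading.
    return len(rows) == expected_rows
```

Rejecting the whole file when any check fails is a deliberately strict design choice for the sketch; a real platform might instead quarantine only the failing records.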
Data enrichment (in-depth data analysis and correlation analysis): internal enterprise analysis.
System integration (integration components): master data manager; service bus (providing data services); exception handling; mapping/transformation/loading; data exchange; workflow; business systems; metadata/master data access control; data entry control; data collection; management/security.
Implementing a master data management platform project
As with many integration projects, the most important part of implementation is developing the business strategy and plan. The key is the understanding and analysis of the data and of business needs by business staff and industry experts; the technical platform is only an important tool for realizing our ideas and does not play the decisive role.
Vendors of master data management platforms
Traditional ERP vendors: both SAP and Oracle have added master data management products to their ERP suites. Drawing on their industry experience, their products come with relatively complete master data models and master data management practices. With its deep experience in CRM and manufacturing, Oracle offers the comprehensive customer master data product UCM 8.0 and the manufacturing master data product PIM 12.0.
Middleware vendors: TIBCO has a dedicated MDM product. I saw its product introduction a year ago and felt that many important features were still missing; of course, I have not had time to study the latest version.
IBM has acquired an MDM product. I have never been keen to study IBM products, so I am in no position to comment. Software AG (webMethods) also has a dedicated product with relatively complete functionality, but its implementation team is weaker.
Oracle product information can be downloaded here: woohooli
I will discuss some details of master data management platforms in later posts on my blog. Your comments are welcome.
Oracle has a clear strategy and roadmap for its MDM products. After the acquisition of BEA, I believe Oracle will combine its already powerful ODI tool and BEA's data integration capabilities with its industry experience to provide more comprehensive products and consolidate its leading position in middleware.
This article is from the CSDN blog. If you repost it, please cite the source: http://blog.csdn.net/tiger119/archive/2009/04/14/4071348.aspx