Basic Initial Load (BIL) Extension Guide

Source: Internet
Author: User
This article explains, in an easy-to-understand way, how to leverage existing structures to add more tables to the types of data that BIL currently loads into the MDM database.

The primary purpose of the MDM Basic Initial Load (BIL) DataStage project is to quickly insert data into a predefined set of core MDM database tables. Although this set of tables covers both the Party and Contract domains, it represents only a subset of all MDM database tables.

The record definitions used to load these core database tables make up the Standard Interface Format (SIF) file. By design, BIL uses a component structure to process SIF files and load their data into the core MDM database tables.

This article outlines the technical aspects of the BIL design and provides detailed guidance on how to extend its functionality by following the existing component style and BIL structures. It also introduces the steps for extending BIL to load additional data elements and data tables.

Technical Overview

BIL consists of seven functional areas: import, validation, preparation, data quality error merging, identifier assignment/insert, data quality error report generation, and cleanup.

Several functional areas are invoked conditionally, and some are mutually exclusive. The validation and preparation functions are driven primarily by job parameters; depending on how a parameter is set, one or more assets are not invoked. Data quality error merging is driven by the presence or absence of error messages. Data quality error report generation runs only when data quality error merging is invoked. Identifier assignment/insert runs only if no error messages exist.
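The invocation rules above can be sketched as a small planning function. This is only an illustration of the described control flow, not BIL's actual DataStage job sequence: the parameter names `skip_validation` and `skip_preparation` are hypothetical, and the sketch assumes the presence of error messages is already known when the plan is built. Cleanup is not modeled here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunConfig:
    # Hypothetical job parameters; the real BIL parameter names may differ.
    skip_validation: bool = False
    skip_preparation: bool = False


def plan_functional_areas(config: RunConfig, has_errors: bool) -> List[str]:
    """Return the functional areas that would be invoked, in order."""
    areas = ["import"]
    if not config.skip_validation:
        areas.append("validation")
    if not config.skip_preparation:
        areas.append("preparation")
    if has_errors:
        # Error messages exist: merge them, then generate the error report.
        areas.append("data quality error merging")
        areas.append("data quality error report generation")
    else:
        # No error messages: identifiers are assigned and rows are inserted.
        areas.append("identifier assignment/insert")
    return areas


if __name__ == "__main__":
    print(plan_functional_areas(RunConfig(), has_errors=False))
    print(plan_functional_areas(RunConfig(skip_validation=True), has_errors=True))
```

In this sketch, the error-handling branch and the insert branch never appear in the same plan, which mirrors the mutually exclusive behavior described above.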

The cleanup function runs independently of the other functional areas. It archives or deletes intermediate datasets and any associated error message files. Because cleanup uses a wildcard naming convention to recognize both intermediate datasets and error message files, it does not need to be adjusted when you extend other functional areas.
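To illustrate why wildcard matching makes cleanup independent of extensions, here is a minimal sketch. The file name patterns (`BIL_*.ds`, `BIL_*_errors.txt`) are assumptions made up for this example; the source does not state BIL's actual naming convention.

```python
import fnmatch
from typing import Iterable, List, Sequence

# Hypothetical naming convention for intermediate datasets and error files.
CLEANUP_PATTERNS = ("BIL_*.ds", "BIL_*_errors.txt")


def select_cleanup_targets(
    filenames: Iterable[str],
    patterns: Sequence[str] = CLEANUP_PATTERNS,
) -> List[str]:
    """Return the file names that match any cleanup wildcard pattern."""
    return [
        name
        for name in filenames
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)
    ]


if __name__ == "__main__":
    files = [
        "BIL_CONTACT.ds",
        "BIL_NEWTABLE.ds",  # produced by a hypothetical extension
        "BIL_CONTACT_errors.txt",
        "input.sif",
    ]
    print(select_cleanup_targets(files))
```

A dataset written by a newly added table is picked up without any change to the cleanup logic, as long as it follows the same naming convention.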

Figure 1 shows the Basic Initial Load (BIL) functional areas and their interrelationships.

Figure 1. BIL functional flow

Extension Guide

This section provides extension guidance for import, validation, preparation, data quality error merging, identifier assignment/insert, and data quality error report generation.

Import Function

In general, the import function implements a pattern that consists of the following steps.

Read delimited data into a text data field until there are no more records to read.

Split or extract the record type identifier.

Split or extract the record identifier.

Use the record type identifier to identify the record metadata.

Split or extract the record columns from the text data field to populate a record with correctly typed data.

Identify and manage records with duplicate record and business key identifiers.

Write the records to correctly named data stores for downstream processing.

Capture and log all errors with formatted error messages.

As described, the pattern lends itself to implementation with any number of tools and techniques. Figure 2 shows a flowchart representation of this pattern.

Figure 2. Import pattern flowchart
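Because the pattern is tool-independent, it can also be sketched outside DataStage. The following is a minimal illustration of the steps, not BIL's implementation. The record types (`P`, `C`), their column layouts, the `|` delimiter, and the policy of rejecting duplicates are all assumptions made up for this example; the real SIF layout is defined elsewhere.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical record metadata: record type -> (column name, converter) pairs.
METADATA: Dict[str, List[Tuple[str, Callable[[str], object]]]] = {
    "P": [("name", str), ("birth_year", int)],
    "C": [("contract_ref", str), ("amount", float)],
}


def import_records(lines: Iterable[str], delimiter: str = "|"):
    """Apply the import pattern; return (data stores, error messages)."""
    stores: Dict[str, List[dict]] = {}  # one named data store per record type
    errors: List[str] = []              # formatted error messages
    seen = set()                        # (record type, identifier) pairs already kept

    for line_no, text in enumerate(lines, start=1):
        text = text.rstrip("\n")        # read the record into a text field
        if not text:
            continue
        fields = text.split(delimiter)
        if len(fields) < 2:
            errors.append(f"line {line_no}: missing record type or identifier")
            continue
        # Extract the record type identifier and the record identifier.
        record_type, identifier, columns = fields[0], fields[1], fields[2:]

        # Use the record type identifier to find the record metadata.
        layout = METADATA.get(record_type)
        if layout is None:
            errors.append(f"line {line_no}: unknown record type '{record_type}'")
            continue
        if len(columns) != len(layout):
            errors.append(f"line {line_no}: expected {len(layout)} columns, got {len(columns)}")
            continue

        # Populate the record with correctly typed data.
        try:
            record = {name: convert(value) for (name, convert), value in zip(layout, columns)}
        except ValueError as exc:
            errors.append(f"line {line_no}: {exc}")
            continue

        # Manage duplicates; this sketch keeps the first and rejects the rest.
        key = (record_type, identifier)
        if key in seen:
            errors.append(f"line {line_no}: duplicate identifier '{identifier}' for type '{record_type}'")
            continue
        seen.add(key)

        record["id"] = identifier
        stores.setdefault(record_type, []).append(record)  # write to the named store

    return stores, errors


if __name__ == "__main__":
    sample = ["P|1|Ann|1980", "P|1|Bob|1975", "C|7|K-1|12.5", "X|9|foo"]
    stores, errors = import_records(sample)
    print(stores)
    print(errors)
```

Under these assumptions, supporting a new record type amounts to adding one entry to `METADATA`; the reading, splitting, duplicate handling, and error capture stay unchanged.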
