InfoSphere DataStage Job Verification Steps Using the Optim Test Data Management Solution

Source: Internet
Author: User

Introduction: The need to validate DataStage jobs

Today, companies are implementing information-centric projects to transform their businesses and achieve cost savings. Many data integration or information integration applications include an ETL (extract, transform, load) process as one of their components.

Typically, an ETL process (unit of work) is designed to perform the following tasks:

Extraction: Extracts data from the source system and collates it.

Transformation: Converts data into the format required by the next step. Typically, this involves applying core business logic to turn data into information.

Loading: Loads the transformed data, typically into a database table or data warehouse, so that a reporting engine can derive insights from it.
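The three tasks above can be sketched in a few lines of plain Python. This is only an illustration of the extract, transform, and load pattern, not how DataStage implements a job; the sample data, the `orders` table, and all function names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an extract from a source system.
SOURCE_CSV = """order_id,amount,currency
1,10.50,usd
2,20.00,eur
3,5.25,usd
"""


def extract(text):
    """Extraction: read raw rows from the source and collate them into a list."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transformation: apply a simple rule (typed values, upper-case currency)."""
    return [
        (int(r["order_id"]), float(r["amount"]), r["currency"].upper())
        for r in rows
    ]


def load(rows, conn):
    """Loading: write the transformed rows into a database table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(text, conn):
    """Run the three steps in order and return the loaded rows."""
    load(transform(extract(text)), conn)
    return conn.execute(
        "SELECT order_id, amount, currency FROM orders ORDER BY order_id"
    ).fetchall()
```

Calling `run_etl(SOURCE_CSV, sqlite3.connect(":memory:"))` runs the whole unit of work against an in-memory database, which keeps the sketch free of external dependencies.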

Jobs in a data integration application commonly go through two life-cycle events:

Porting or migrating a job from an older version to a newer version of the DataStage software, or of the hardware running it.

Migrating jobs from the development environment to the test environment and then to the production environment.

Both of these use cases require validating a large number of DataStage jobs. Businesses often need to verify that jobs running on a new version of the software or in a new hardware environment produce the same results as before, which gives them confidence that the new system can replace the old one. Similarly, before you deploy a job in a data integration process to a production environment, you must confirm that it behaves as expected in the development, test, and production environments.

This article provides a step-by-step example of how DataStage users can use the IBM InfoSphere Optim Test Data Management Solution to validate the results of an ETL job.

Using the Optim Test Data Management Solution with DataStage

In the validation process for a DataStage job, the Optim Test Data Management Solution can be used to:

Generate test data

Compare the job output with an expected, or baseline, output

During the validation process, the DataStage job references the generated test data as the input source. After the DataStage job is executed, a comparison step is performed to verify the final output.

The workflow can be represented as shown in the diagram.

Figure 1. Workflow for verifying a DataStage job using Optim TDM
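The comparison step at the end of this workflow can be illustrated with a minimal Python sketch. It does not use or reproduce Optim's own compare facility; it only shows the idea of checking a job's output against a baseline. The function names, the row format (a list of dictionaries), and the key column are assumptions made for this example.

```python
def compare_outputs(actual, expected, key):
    """Compare job output rows with baseline rows, matched on a key column.

    Returns the keys that are missing from the output, the keys that appear
    only in the output, and the keys whose rows differ.
    """
    actual_by_key = {row[key]: row for row in actual}
    expected_by_key = {row[key]: row for row in expected}

    missing = sorted(expected_by_key.keys() - actual_by_key.keys())
    unexpected = sorted(actual_by_key.keys() - expected_by_key.keys())
    changed = sorted(
        k
        for k in actual_by_key.keys() & expected_by_key.keys()
        if actual_by_key[k] != expected_by_key[k]
    )
    return {"missing": missing, "unexpected": unexpected, "changed": changed}


def job_is_valid(report):
    """The job passes only when the comparison found no differences."""
    return not any(report.values())


# Hypothetical baseline (expected output) and actual job output.
baseline = [{"id": 1, "total": 10.5}, {"id": 2, "total": 20.0}]
job_output = [
    {"id": 1, "total": 10.5},
    {"id": 2, "total": 21.0},
    {"id": 3, "total": 1.0},
]

report = compare_outputs(job_output, baseline, key="id")
```

Here `report` flags row 2 as changed and row 3 as unexpected, so `job_is_valid(report)` is `False`. Running the same job against the same generated test data in each environment and comparing against one baseline is what makes the results comparable.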

In subsequent sections, you will see an example of generating test data for a DataStage job and then comparing the job's final output with the expected results to validate the job.
