Using InfoSphere DataStage Runtime Column Propagation (RCP) in ETL


General description

As enterprise informatization advances, many enterprises are building business intelligence systems tailored to their own industries to guide business operations. A well-designed, efficiently run business intelligence system plays an increasingly important role in business decision-making.

IBM InfoSphere Information Server is the foundation for an extensible enterprise information architecture. It handles the very large volumes of information that enterprises work with and helps them deliver high-quality business results faster. IBM InfoSphere DataStage, an important component of it, supports collecting, transforming, and distributing data whose structure ranges from simple to complex, and it uses high-performance processing of large data volumes to address business problems that depend on large-scale data.

Runtime column propagation (RCP) is an enhanced feature of InfoSphere DataStage. With it, a user-defined job can import not only the data fields that are defined in the job at design time but also additional columns of interest that are specified at run time. In other words, the set of data columns that the job imports is determined when the job runs. As a result, you can change what data a job imports simply by changing the schema file at run time, without modifying the job.
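The idea can be illustrated with a small toy model in Python. This is only a conceptual sketch, not DataStage code: the function `run_stage`, the column names, and the `rcp_enabled` flag are all hypothetical and exist purely to show the difference between columns a job defines and columns that are carried through at run time.

```python
# Toy model of runtime column propagation; this is NOT DataStage code.
# DESIGN_TIME_COLUMNS stands in for the columns defined in the job (hypothetical).
DESIGN_TIME_COLUMNS = ("customer_id", "name")


def run_stage(row, rcp_enabled):
    """Apply the job's logic to the defined columns and optionally pass extras through."""
    out = {
        "customer_id": row["customer_id"],
        "name": row["name"].strip().upper(),  # the only logic the job itself defines
    }
    if rcp_enabled:
        # Columns the job never mentions are carried to the output unchanged.
        for column, value in row.items():
            if column not in DESIGN_TIME_COLUMNS:
                out[column] = value
    return out


# The source gained a "region" column that the job design knows nothing about.
row = {"customer_id": 7, "name": " ada ", "region": "EMEA"}
with_rcp = run_stage(row, rcp_enabled=True)
without_rcp = run_stage(row, rcp_enabled=False)
```

In this model, `with_rcp` keeps the extra `region` column while `without_rcp` drops it, which mirrors why the same job can serve sources whose column sets differ.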

A prerequisite for using this technique is that the project has runtime column propagation enabled. You can turn this support on or off in the project's Properties window. Typically, the data source stage must specify a file that contains the column definitions of the source data, called a schema file. By updating the contents of the schema file at run time, you let the DataStage job read data from the data source file dynamically.
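Because a schema file is plain text, it can be generated by a script before each run. The following Python sketch builds one from a list of column definitions. It assumes the usual parallel-job schema layout (a `record` keyword, a property list in braces, then `name: type;` entries in parentheses); the helper `build_schema`, the column names, and the chosen delimiter and quote properties are illustrative assumptions and should be checked against the schema format documentation for your DataStage version.

```python
def build_schema(columns, delimiter=","):
    """Return schema-file text for the given (name, type) pairs.

    The record-level properties below (final_delim, delim, quote) are
    assumptions for a comma-separated source file, not values from the article.
    """
    lines = [
        "record",
        "{final_delim=end, delim='%s', quote=double}" % delimiter,
        "(",
    ]
    for name, type_spec in columns:
        lines.append(f"  {name}: {type_spec};")
    lines.append(")")
    return "\n".join(lines) + "\n"


# Hypothetical column definitions for a customer extract.
columns = [
    ("CUSTOMER_ID", "int32"),
    ("NAME", "string[max=50]"),
    ("BALANCE", "nullable decimal[10,2]"),
]
schema_text = build_schema(columns)
```

Writing `schema_text` to the path that the source stage reads its schema from, and regenerating it when the source layout changes, is one way to alter what the job imports without editing the job.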

In DataStage 8.7, the following stages can accept a schema file:

Sequential File

File Set

Column Import

Column Export

External Source

External Target

This article first describes the project-level settings that enable runtime column propagation and how to create a schema file. Then, drawing on years of business intelligence project experience, it sets up a typical RCP usage scenario and implements it step by step in InfoSphere DataStage. It covers each detail, including the job design and the parameter settings of each stage, to show how RCP lets ETL reuse DataStage jobs and improve the quality of data processing.

Start using RCP

Enabling RCP at the project level

Open the Administrator client. In the login window shown in Figure 1, enter the server address and port number of InfoSphere Information Server (the default port number is 9080), then enter your user name and password. Click the drop-down list below them and select the engine name of the installed InfoSphere Information Server. When all fields are complete, click Login.

Figure 1. Administrator Client Login window

After you log in successfully, the project selection window shown in Figure 2 opens. Click the Projects tab, select the correct project, and click Properties to open the project properties window.

Figure 2. Project Selection window

In the project's Properties window, shown in Figure 3, select Enable Runtime Column Propagation for Parallel Jobs.

Figure 3. Project Properties Settings window
