General description
As enterprise informatization advances, many enterprises are building business intelligence systems tailored to their own industries to guide business operations. A well-designed, efficiently running business intelligence system plays an increasingly important role in business decision-making.
IBM InfoSphere Information Server serves as the foundation of an extensible enterprise information architecture. It meets the enterprise's need to handle very large volumes of information and helps enterprises deliver high-quality business results faster. IBM InfoSphere DataStage, an important component of Information Server, supports the collection, transformation, and distribution of data with structures ranging from simple to complex, and uses high-performance processing of large data volumes to solve business problems that depend on large-scale data.
Runtime column propagation (RCP) is an enhanced feature of InfoSphere DataStage. With RCP, a user-defined job can import not only the columns that are explicitly defined in the job design but also additional columns of interest that are determined at run time. In other words, the set of columns that the job imports becomes a property decided when the job runs. As a result, you can change what data a job imports simply by changing the schema file at run time, without modifying the job.
A prerequisite for using this technique is that the project supports runtime column propagation. You can enable or disable this feature in the project's Properties window. Typically, the data source stage must specify a file that contains the column definitions of the source data, called a schema file. By updating the contents of the schema file, you let the DataStage job determine at run time which columns to read from the data source file.
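To make the idea of a schema file more concrete, the following minimal Python sketch generates schema-file text from a list of column definitions. It assumes the record-schema layout commonly used by parallel jobs (a `record` keyword, format properties in braces, then `name: type;` entries in parentheses); check the exact syntax and properties against the DataStage documentation for your version. The `Column` class, the `render_schema` and `write_schema` functions, and the column names are hypothetical helpers for illustration, not part of any DataStage API.

```python
from dataclasses import dataclass


@dataclass
class Column:
    """One column definition (hypothetical helper, not a DataStage API)."""
    name: str
    type_spec: str          # e.g. "int32", "string[max=30]", "decimal[10,2]"
    nullable: bool = False


def render_schema(columns, delim=",", quote="double"):
    """Render column definitions as schema-file text.

    The layout below is an assumption based on the usual record-schema
    format; adjust the properties to match your source file.
    """
    lines = [
        "record",
        f"  {{final_delim=end, delim='{delim}', quote={quote}}}",
        "(",
    ]
    for col in columns:
        prefix = "nullable " if col.nullable else ""
        lines.append(f"  {col.name}: {prefix}{col.type_spec};")
    lines.append(")")
    return "\n".join(lines) + "\n"


def write_schema(path, columns):
    """Write the rendered schema text to the given path."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(render_schema(columns))


# Illustrative column names only; replace with the columns of your source data.
columns = [
    Column("CUSTOMER_ID", "int32"),
    Column("NAME", "string[max=30]"),
    Column("BALANCE", "decimal[10,2]", nullable=True),
]
schema_text = render_schema(columns)
print(schema_text)
```

Regenerating the file with a different column list before a job run is one way to change what the job imports without editing the job design, which is the behavior RCP is meant to enable.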
In DataStage 8.7, the following stages can accept a schema file.
Sequential File
File Set
Column Import
Column Export
External Source
External Target
This article first describes the project-level settings that enable runtime column propagation and how to create a schema file. Then, drawing on years of business intelligence project experience, it constructs a typical RCP usage scenario and implements it step by step in InfoSphere DataStage, covering every detail, including the job design and the parameter settings of each stage, and shows how RCP lets DataStage jobs be reused in ETL to improve the quality of data processing.
Start using RCP
Project-level settings for RCP support
Open the Administrator client. In the login window shown in Figure 1, enter the server address and port number of InfoSphere Information Server (the default port number is 9080), then enter your user name and password. Click the list box below and select the engine name of the installed InfoSphere Information Server. When all fields are complete, click Login.
Figure 1. Administrator Client Login window
After a successful login, the Project Selection window shown in Figure 2 opens. Click the Projects tab, select the correct project, and click Properties to open the project Properties window.
Figure 2. Project Selection window
In the project Properties window, shown in Figure 3, select Enable Runtime Column Propagation for Parallel Jobs.
Figure 3. Project Properties Settings window