IBM InfoSphere DataStage is a leading ETL (Extract, Transform, Load) tool. It uses a client-server architecture in which all projects and metadata are stored on the server, and it supports collecting, integrating, and transforming large volumes of data in many different structures. The DataStage Designer client provides a graphical development environment for the entire ETL process, and users design and develop DataStage jobs in Designer. DataStage provides many processing stages to meet ETL needs, and of these the Transformer Stage is the most widely used. This article describes in detail how the Transformer Stage is used in the ETL process and which functions it implements. The IBM InfoSphere DataStage discussed in this article is the version included in IBM Information Server 8.0.1.
Introduction to Transformer Stage components
The Transformer Stage is an important and powerful component of DataStage. In the ETL process it handles the "T", that is, the transformation of data. In a Transformer Stage you can specify the source and destination of the data, map input fields to the corresponding output fields, and specify conversion rules and constraints.
Figure 1. Application of Transformer Stage in DataStage job
Figure 2. Transformer Stage column mappings and field expressions
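The three ideas named above (column mapping, field expressions, and constraints) can be illustrated outside DataStage with a small Python sketch. This is only an analogy, not DataStage code: the `transform` helper and the column names `ID`, `NAME`, `CUST_ID`, and `CUST_NAME` are hypothetical and chosen for illustration.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

Row = Dict[str, Any]

def transform(rows: Iterable[Row],
              derivations: Dict[str, Callable[[Row], Any]],
              constraint: Callable[[Row], bool] = lambda row: True) -> Iterator[Row]:
    """Yield one output row for every input row that passes the constraint."""
    for row in rows:
        if constraint(row):
            # Each output column is computed by its own expression on the input row.
            yield {column: derive(row) for column, derive in derivations.items()}

# Hypothetical input link with three rows.
source = [
    {"ID": 1, "NAME": " alice "},
    {"ID": 2, "NAME": "bob"},
    {"ID": -1, "NAME": "bad row"},
]

target = list(transform(
    source,
    derivations={
        "CUST_ID": lambda r: r["ID"],                      # straight column mapping
        "CUST_NAME": lambda r: r["NAME"].strip().upper(),  # field expression
    },
    constraint=lambda r: r["ID"] > 0,                      # rows failing this are dropped
))

for out_row in target:
    print(out_row)
```

In this sketch the third input row is filtered out by the constraint, which roughly mirrors how a constraint on an output link keeps rows from reaching the target.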
Functions and case analysis of Transformer Stage in DataStage jobs
1. Field Conversion
Field conversion is one of the most commonly used features of the Transformer Stage. It transforms source data into target data according to specified rules. The following examples use Date and Timestamp, two types that appear frequently in ETL processes, to show how a field conversion is implemented.
1.1 Source data type is Timestamp, target type is Date
Listing 1. Time conversion function
TimestampToDate(in.Add_date)
Figure 3. Field Transformation expression
Figure 4. Before and after field conversion
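For readers who want to check the expected result outside DataStage, the same conversion can be sketched in Python. This is an analogy only: `timestamp_to_date` is a hypothetical helper, and the sample value is invented for illustration.

```python
from datetime import date, datetime

def timestamp_to_date(ts: datetime) -> date:
    """Drop the time-of-day part and keep only year, month, and day."""
    return ts.date()

# Hypothetical source value for the Add_date column.
add_date = datetime(2010, 5, 17, 14, 30, 45)
converted = timestamp_to_date(add_date)
print(converted)  # 2010-05-17
```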
1.2 The source data type is Date and the target type is Timestamp
This conversion takes three steps: first convert the Date value to a Varchar string, then append the time portion that a Timestamp requires, and finally convert the resulting string to the Timestamp type.
Listing 2. Time conversion function
StringToTimestamp(DateToString(in.Add_date, "%yyyy-%mm-%dd") : ' 00:00:00', "%yyyy-%mm-%dd %hh:%nn:%ss")
Figure 5. Field Transformation expression
Figure 6. Before and after field conversion
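The three steps of this conversion (Date to string, append a time, string to Timestamp) can be sketched in Python as a cross-check. This is an analogy, not DataStage code: `date_to_timestamp` is a hypothetical helper, and padding with midnight (`00:00:00`) is an assumption made for illustration.

```python
from datetime import date, datetime

def date_to_timestamp(d: date, time_part: str = "00:00:00") -> datetime:
    """Convert a date to a timestamp by way of an intermediate string."""
    # Step 1: render the date as text, e.g. "2010-05-17".
    date_text = d.strftime("%Y-%m-%d")
    # Step 2: append the time portion (assumed here to be midnight by default).
    padded = date_text + " " + time_part
    # Step 3: parse the padded text using a format that includes the time fields.
    return datetime.strptime(padded, "%Y-%m-%d %H:%M:%S")

result = date_to_timestamp(date(2010, 5, 17))
print(result)  # 2010-05-17 00:00:00
```

Note that the separator between the date and the time in the padded string has to match the separator in the parse format, otherwise the final conversion fails.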