Secondary development manual

Source: Internet
Author: User


1 Introduction

1.1 Purpose of writing

This manual is intended for developers who understand the data integration business and are familiar with ODI operations. It serves as a reference for secondary development (customization) of KMs on ODI, and describes in detail how to customize a KM to meet a user's specific data integration needs in a particular scenario.

2 Introduction to KM development templates

2.1 KM overview

A KM (Knowledge Module) is a code template in ODI. Each KM corresponds to a specific task in the integration process, and a complete data integration flow is carried out by selecting several KM code templates and generating executable code from them.

A KM is abstract and reusable. It describes the rules and procedures of the integration process as logical tasks, independent of physical objects (such as data tables, physical paths, and columns). During integration, the metadata stored in interfaces, models, and packages (database connections, mapping relationships, and so on) is injected into the KM as parameters. A KM therefore resembles an abstract interface that is decoupled from specific business objects, so a single KM can be reused by multiple integration projects. Because KMs abstract and summarize the processes and rules of data integration, building a complete KM library can greatly reduce the complexity of data integration.
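The reuse idea described above can be illustrated with a small Python sketch. It is not ODI code: the `{placeholder}` template syntax, the `render_km` function, and the metadata keys are all hypothetical stand-ins, chosen only to show one abstract template producing different concrete statements once project metadata is injected.

```python
# Minimal sketch (not ODI code): one abstract template, many concrete statements.
# The template text, render_km, and the metadata keys are hypothetical.

KM_TEMPLATE = "insert into {target} ({columns}) select {columns} from {source}"


def render_km(template, metadata):
    """Fill the abstract template with metadata from one integration project."""
    return template.format(
        target=metadata["target"],
        source=metadata["source"],
        columns=", ".join(metadata["columns"]),
    )


# Two unrelated projects reuse the same template with their own metadata.
sales = {"target": "DW.SALES", "source": "STG.C$_SALES", "columns": ["ID", "AMOUNT"]}
staff = {"target": "HR.STAFF", "source": "STG.C$_STAFF", "columns": ["ID", "NAME"]}

sales_sql = render_km(KM_TEMPLATE, sales)
staff_sql = render_km(KM_TEMPLATE, staff)
print(sales_sql)
print(staff_sql)
```

The template never names a physical table; only the metadata dictionaries do, which is the property that lets one KM serve several projects.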

ODI ships with multiple KMs for different integration scenarios and processes, and users meet different integration requirements by invoking them. KMs can also be extended and rewritten: when the existing KM templates cannot meet a requirement, a custom KM can be written for the special scenario.

2.2 KM classification

ODI divides KMs into the following categories according to the integration process and function: RKM (Reverse-engineering KM), CKM (Check KM), LKM (Loading KM), IKM (Integration KM), JKM (Journalizing KM), and SKM (Service KM). Each category performs a specific class of functions, as shown in Table 2-1 below:

KM type | Description                                     | Usage scenarios
--------|-------------------------------------------------|--------------------------------------------------------------------------
RKM     | Reverse-engineers and extracts metadata         | Used in models, to extract metadata
CKM     | Checks whether data meets constraints           | Used in models, for data consistency; used in interfaces, for flow control
LKM     | Loads heterogeneous data into the staging area  | Used in interfaces, to load heterogeneous data sources
IKM     | Integrates staging-area data into the target    | Used in interfaces
JKM     | Creates a change data capture framework         | Used in models, to turn journalizing on or off
SKM     | Generates data manipulation web services        | Used in models

Table 2-1: KM classification and function

Each type of KM is described in detail below:

2.2.1 RKM

The primary responsibility of an RKM is to reverse-engineer model metadata into the work repository. The RKM connects to the data source or application system, extracts metadata such as data stores and columns, then cleans, transforms, and loads it into the SNP_REV_SUB_MODEL and SNP_REV_TABLE tables. The RKM then writes this information to the work repository by calling the OdiReverseSetMetaData API. Depending on the data source being reverse-engineered, RKMs include RKM Oracle, RKM DB2, RKM File, RKM SQL (JYTHON), and so on. Figure 2-1 shows the RKM workflow:

Figure 2-1: RKM workflow

2.2.2 CKM

A CKM is primarily used to check the consistency of data records against defined constraints. It is used in two places:

• Static data consistency checks: this type of CKM is applied to a model. By setting constraints it can cleanse and filter data so that only the required data is integrated; for example, a constraint can keep only the records of students older than 18.

• Checks during loading: this type of CKM is used in an interface to control data as it flows through the integration process. It is enabled by setting the FLOW_CONTROL option to Yes in Designer.

• In addition, the CKM creates an error table with the prefix E$ in the staging area to hold invalid data.

Its processing flow is shown in Figures 2-2 and 2-3:

Figure 2-2: CKM

Figure 2-3: Data flow control
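As a rough illustration of the check described above, the following Python sketch splits rows into those that satisfy a constraint and those that are diverted to an error set, in the spirit of the E$ table. The function `check_constraint`, the row layout, and the `ERR_MESS` field are hypothetical and only mimic the behavior; a real CKM generates SQL against the staging area.

```python
# Minimal sketch (not ODI code): split rows into valid rows and an error table.
# check_constraint, the row layout, and the error-row fields are hypothetical.


def check_constraint(rows, predicate, message):
    """Return (valid_rows, error_rows); rejected rows carry the reason."""
    valid, errors = [], []
    for row in rows:
        if predicate(row):
            valid.append(row)
        else:
            rejected = dict(row)
            rejected["ERR_MESS"] = message
            errors.append(rejected)
    return valid, errors


students = [
    {"NAME": "Ann", "AGE": 20},
    {"NAME": "Bob", "AGE": 17},
    {"NAME": "Cid", "AGE": 18},
]

# Constraint from the text: keep only students older than 18.
valid_rows, error_rows = check_constraint(
    students, lambda r: r["AGE"] > 18, "AGE must be greater than 18"
)
print(valid_rows)
print(error_rows)
```

Only the rows in `valid_rows` would continue to the target; the rejected rows stay available for inspection together with the reason they failed.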

2.2.3 LKM

An LKM is primarily used to read data from data sources into the staging area. It is used in the design of an interface and stores source data in the C$ table of the staging area. Its main tasks are:

• The LKM extracts data from a remote data source and loads it into the staging area for use in the interface.

• The LKM creates a C$ table in the staging area and loads the data into it.

• The LKM performs some simple pre-transformation work, much like the SELECT clause of an SQL statement.

In addition, different data sources have different LKMs, such as LKM SQL to SQL (JYTHON), LKM SQL to SQL, LKM Oracle to Oracle (DBLINK), and LKM MSSQL to MSSQL (BCP). The processing flow is shown in Figure 2-4:

Figure 2-4: LKM workflow
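The three LKM tasks listed above can be sketched with Python's built-in `sqlite3` module. This is an assumption-laden illustration, not generated ODI code: two in-memory SQLite databases stand in for a remote source and the staging area, and the `CUSTOMER` and `C$_CUSTOMER` tables and their columns are hypothetical.

```python
# Minimal sketch (not ODI code): copy source rows into a C$ work table in staging.
# The databases, table names, and columns are hypothetical; sqlite3 stands in
# for two heterogeneous servers.
import sqlite3

source = sqlite3.connect(":memory:")
staging = sqlite3.connect(":memory:")

source.execute("create table CUSTOMER (ID integer, FIRST_NAME text, LAST_NAME text)")
source.executemany(
    "insert into CUSTOMER values (?, ?, ?)",
    [(1, "Ada", "Lovelace"), (2, "Alan", "Turing")],
)

# Step 1: create the C$ work table in the staging area.
staging.execute('create table "C$_CUSTOMER" (ID integer, FULL_NAME text)')

# Step 2: read from the source with a simple pre-transformation in the SELECT.
rows = source.execute(
    "select ID, FIRST_NAME || ' ' || LAST_NAME from CUSTOMER order by ID"
).fetchall()

# Step 3: load the rows into the C$ table.
staging.executemany('insert into "C$_CUSTOMER" values (?, ?)', rows)
staging.commit()

loaded = staging.execute(
    'select ID, FULL_NAME from "C$_CUSTOMER" order by ID'
).fetchall()
print(loaded)
```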

2.2.4 IKM

The role of an IKM is to load the transformed final data into the target. The IKM assumes that all data has already been loaded into the staging area by an LKM, and reads the data directly from the staging area. IKMs are divided into two types depending on the location of the staging area: those for a staging area on the target server, and those for a staging area that is not on the target server. The process is as follows:

• Select the IKM type in the interface according to the actual scenario.

• The source is the C$ table in the staging area. It is transformed to produce a result set, the result set is stored in the I$ table, and the I$ table is then merged into the target table.

• If a CKM is used, data that fails the checks during the merge is placed in the E$ table by the CKM.

The flow is shown in Figures 2-5 and 2-6:

Figure 2-5: Staging area on the target server

Figure 2-6: Staging area not on the target server
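The C$ to I$ to target sequence can be sketched as follows for the case where the staging area sits on the target server. The example uses Python's `sqlite3` module with hypothetical table names and data, and uses SQLite's `insert or replace` as a stand-in for the merge step; the SQL a real IKM generates depends on the technology and the KM chosen.

```python
# Minimal sketch (not ODI code): C$ -> I$ -> target, staging area on the target.
# Table names, columns, and the upper() transformation are hypothetical.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    create table "C$_PRODUCT" (ID integer, NAME text, PRICE real);
    create table "I$_PRODUCT" (ID integer primary key, NAME text, PRICE real);
    create table PRODUCT (ID integer primary key, NAME text, PRICE real);
    insert into "C$_PRODUCT" values (1, 'pen', 1.5), (2, 'ink', 4.0);
    insert into PRODUCT values (1, 'pen', 1.0);
    """
)

# Transform the C$ rows and store the result set in the I$ flow table.
db.execute(
    'insert into "I$_PRODUCT" select ID, upper(NAME), PRICE from "C$_PRODUCT"'
)

# Merge the I$ table into the target: rows with an existing key are replaced,
# rows with a new key are inserted.
db.execute('insert or replace into PRODUCT select ID, NAME, PRICE from "I$_PRODUCT"')
db.commit()

target_rows = db.execute("select ID, NAME, PRICE from PRODUCT order by ID").fetchall()
print(target_rows)
```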

2.2.5 JKM

The role of a JKM is to capture changed data. It is used mainly for CDC (change data capture), capturing incremental changes in preparation for incremental integration. A JKM records changed data in the journal table (J$), either by creating triggers on the data source or by using logs, flag bits, and so on, so that the changes can be integrated into the target.

• A JKM is used on a model and cannot be used in an interface.

• A JKM automatically sets up and creates the parameters, tables, triggers, and views required by the framework.

• The JKM framework also contains the metadata needed to maintain CDC.

Its processing flow is shown in Figure 2-7:

Figure 2-7: JKM processing flow
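The trigger-based variant of change capture can be sketched with SQLite triggers. This is only an illustration of the idea: the `ORDERS` table, the `J$ORDERS` journal layout, and the `I`/`U`/`D` flags are hypothetical, and the objects a real JKM creates differ by technology.

```python
# Minimal sketch (not ODI code): triggers record every change in a J$ journal table.
# The table names, journal columns, and change flags are hypothetical.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    create table ORDERS (ID integer primary key, STATUS text);
    create table "J$ORDERS" (CHANGE_TYPE text, ID integer);

    create trigger T_ORDERS_INS after insert on ORDERS
    begin
        insert into "J$ORDERS" values ('I', new.ID);
    end;

    create trigger T_ORDERS_UPD after update on ORDERS
    begin
        insert into "J$ORDERS" values ('U', new.ID);
    end;

    create trigger T_ORDERS_DEL after delete on ORDERS
    begin
        insert into "J$ORDERS" values ('D', old.ID);
    end;
    """
)

# Ordinary changes on the source table are journalized automatically.
db.execute("insert into ORDERS values (1, 'new')")
db.execute("update ORDERS set STATUS = 'paid' where ID = 1")
db.execute("delete from ORDERS where ID = 1")
db.commit()

journal = db.execute(
    'select CHANGE_TYPE, ID from "J$ORDERS" order by rowid'
).fetchall()
print(journal)
```

An incremental integration would then read only the keys listed in the journal instead of rescanning the whole source table.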

2.2.6 SKM

The role of an SKM is to deploy data operations as web services and publish them to an SOA architecture. An SKM is applied to a model. Unlike other KMs, an SKM does not generate executable code; it generates only deployment files, which are compiled and then deployed to the application server's container, so that the data services published by ODI can be accessed as web services.

3 KM development rules

The previous section introduced the basics of KMs, the overall framework, and the function and use of each KM type. This section takes a practical approach and describes how to develop a custom KM for integration purposes, covering the development languages, the development rules, and the development platforms.

3.1 KM development language

KMs support Jython, standard SQL, and ODI's substitution methods; a KM is composed by combining these scripting and programming languages.

Different knowledge modules, and different steps within them, do not all look the same: some are SQL statements, some are Java-based code (Jython scripts), and some contain functions such as odiRef.getInfo(). The meanings of these substitution variables and functions can be found in the ODI reference documentation. Each step of a knowledge module can be deleted, and new steps can be added, so ODI is easy to extend. The easiest way to customize is to copy an existing knowledge module and modify it into a new one; the other way is to write the knowledge module entirely in Jython, which is more flexible but also more complex.

3.1.1 Jython

Jython is the scripting language for KMs: a KM can be written entirely in Jython, or Jython code can be mixed with SQL, PL/SQL, and OS commands. Jython significantly improves the programmability of ODI and makes complex operations possible (such as manipulating strings and lists, accessing FTP, managing files, and calling external Java classes).

Jython is a 100% pure Java implementation of the Python programming language. It combines the benefits of Python with those of the Java virtual machine and libraries, making it a useful complement to the Java platform.

Jython, originally known as JPython, is a full Java application that supports the syntax and most features of the Python programming language. Compared with other programming languages, Jython has several advantages:

• The Jython version of the Python interpreter shell makes it easy to experiment with ideas and APIs without going through the usual Java compile-and-run cycle.

• Python is designed to be dynamic and versatile, so these capabilities do not have to be added through complex libraries, such as those for Java reflection and introspection. This makes development easier and is particularly useful in automated testing frameworks.

• Deployment is straightforward: code can be deployed quickly without spending much time on packaging and compilation cycles.

• It is easy to learn and use; the technical barrier is low, so many users can pick it up easily.

3.1.2 Substitution API

A KM contains many calls to substitution API methods. Calling methods such as getTable() and getObjectName(), instead of hard-coding physical table names, schema names, and catalogs in the KM, improves the reusability of the KM. Calling the substitution APIs that ODI provides also makes the code easier to develop, because the relevant metadata can be retrieved directly. Using the substitution API sensibly when writing a KM therefore reduces development effort and helps ensure KM quality.

The methods in the substitution API are written in Java and return strings. A KM obtains the metadata stored in the master repository and the work repository by calling substitution methods. The following table shows an example: the KM code calls methods in the API, and ODI generates the final program code when the KM is run.

Code in the KM (before generation):

Create table <%=odiRef.getTable("L", "INT_NAME", "A")%>
(
<%=odiRef.getColList("", "\t[COL_NAME] [DEST_CRE_DT]", ",\n", "", "")%>
)

Code generated by ODI (after generation):

Create table db_staging.I$_PRODUCT
(
PRODUCT_ID numeric(10),
PRODUCT_NAME varchar(250),
FAMILY_ID numeric(4),
SKU varchar(13),
LAST_DATE timestamp
)
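To make the expansion from a column-list pattern to generated code more concrete, here is a small Python emulation. It is not the ODI implementation: `get_col_list` is a hypothetical helper that takes the column metadata as an explicit argument (the real method reads it from the repository and takes a selector instead), and the metadata values are made up to match the example.

```python
# Minimal sketch (not the ODI implementation): emulate how a column-list
# pattern is expanded. get_col_list and the metadata layout are hypothetical.


def get_col_list(start, pattern, separator, end, columns):
    """Expand `pattern` once per column and join the results with `separator`."""
    parts = []
    for column in columns:
        text = pattern
        for key, value in column.items():
            # Replace each [TAG] in the pattern with the column's metadata value.
            text = text.replace("[" + key + "]", value)
        parts.append(text)
    return start + separator.join(parts) + end


columns = [
    {"COL_NAME": "PRODUCT_ID", "DEST_CRE_DT": "numeric(10)"},
    {"COL_NAME": "PRODUCT_NAME", "DEST_CRE_DT": "varchar(250)"},
    {"COL_NAME": "LAST_DATE", "DEST_CRE_DT": "timestamp"},
]

ddl = (
    "create table db_staging.I$_PRODUCT\n(\n"
    + get_col_list("", "\t[COL_NAME] [DEST_CRE_DT]", ",\n", "", columns)
    + "\n)"
)
print(ddl)
```

The pattern is written once and repeated per column, so the same KM line yields a correct column list for any data store.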

The detailed syntax rules for calling substitution methods in a KM are described in section 3.2.2.

3.1.3 SQL

In ODI, most data integration work is done through SQL statements, for example moving source data to the staging area and staging-area data to the target, so KMs contain many SQL operations that insert, update, and delete data. If you read the KMs carefully, you will find that most KM code consists of SQL statements with embedded substitution methods, or uses Jython for flow control. In Operator, the generated KM code is mostly standard SQL, so writing SQL is also a very important part of developing a KM.

For example, the following is the load data step in LKM SQL to SQL, which embeds substitution methods in the overall framework of an SQL statement:

insert into <%=snpRef.getTable("L", "COLL_NAME", "A")%>
(
	<%=snpRef.getColList("", "[CX_COL_NAME]", ",\n\t", "", "")%>
)
values
(
	<%=snpRef.getColList("", ":[CX_COL_NAME]", ",\n\t", "", "")%>
)
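The statement above produces an INSERT whose values are bind variables named after the columns. The following Python sketch shows what such a generated statement looks like and how named bind variables receive row values. It assumes hypothetical column metadata and a hypothetical `C$_CUSTOMER` table, and uses `sqlite3` only as a convenient stand-in for the staging database.

```python
# Minimal sketch (not ODI code): build an INSERT with named bind variables
# and execute it once per source row. Table and column names are hypothetical.
import sqlite3

columns = ["ID", "NAME"]  # hypothetical column metadata

insert_sql = "insert into {table}\n(\n\t{cols}\n)\nvalues\n(\n\t{binds}\n)".format(
    table='"C$_CUSTOMER"',
    cols=",\n\t".join(columns),
    binds=",\n\t".join(":" + c for c in columns),
)
print(insert_sql)

staging = sqlite3.connect(":memory:")
staging.execute('create table "C$_CUSTOMER" (ID integer, NAME text)')

# Each source row supplies the values for the :ID and :NAME bind variables.
source_rows = [{"ID": 1, "NAME": "Ada"}, {"ID": 2, "NAME": "Alan"}]
staging.executemany(insert_sql, source_rows)
staging.commit()

inserted = staging.execute('select ID, NAME from "C$_CUSTOMER" order by ID').fetchall()
print(inserted)
```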

3.2 Development rules

3.2.1 Rules for using Jython in a KM

Jython scripts can be compiled and run in ODI. For example, LKM SQL to SQL (Jython) reads the source data by calling JDBC from a Jython script, and many other KMs are also written in Jython. In addition, Jython can reference externally developed JAR packages, so besides the substitution API that ODI provides, any API written in Java can be used through Jython. Adding Jython as a programming language therefore makes KMs more flexible and more extensible, and learning to write Jython scripts in a KM is particularly important. The following is a brief introduction to the rules for developing with Jython; for detailed syntax, see the "Jython Quick Reference" provided with ODI or other books on Jython.

There are a few points to note when using Jython in a KM:

• Code execution: Jython statements are executed sequentially, and the if, for, while, and raise control structures are supported.

• Blocks: Jython delimits blocks by indentation; consecutive lines with the same indentation (spaces or tabs) belong to the same block.

• Comments: a comment begins with the # character and ends at the end of the physical line.

• Keywords: the following identifiers are reserved by Jython and cannot be used as variable names:

and       del      for     is      raise
assert    elif     from    lambda  return
break     else     global  not     try
class     except   if      or      while
continue  exec     import  pass
def       finally  in      print

• Data types: Jython supports numbers (decimal integers, octal integers, hexadecimal integers, long integers, floats, and complex numbers), strings, lists, dictionaries, tuples, sequences, files, and so on.

• External class loading: external classes are loaded in a KM with the import statement, and the external class files must be on the classpath.
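The rules above can be seen together in a short script. It is written in plain Python syntax that is also intended to be valid in Jython; the function and variable names are hypothetical. In a real KM, Java classes would typically be loaded the same way (for example `from java.util import Date` in Jython), which is omitted here so that the sketch also runs under standard Python.

```python
# Comments start with '#' and run to the end of the physical line.
import os.path  # external modules and classes are loaded with import


def classify(values):
    """Blocks are delimited by indentation, not by braces."""
    result = {"small": [], "large": []}   # a dictionary holding two lists
    for value in values:                  # for loop
        if value < 0:                     # if / elif / else with raise
            raise ValueError("negative value: %d" % value)
        elif value < 10:
            result["small"].append(value)
        else:
            result["large"].append(value)
    return result


groups = classify([3, 0x1F, 12, 7])       # decimal and hexadecimal integers
point = (1.5, 2 + 3j)                     # a tuple with a float and a complex number
file_name = os.path.basename("/tmp/data/source.csv")

print(groups)
print(point)
print(file_name)
```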

3.2.2 Rules for calling the substitution API in a KM

Oracle Data Integrator provides a large number of ready-made substitution APIs, which are often called directly during development to obtain metadata stored in the repository. The substitution API can be called anywhere in a KM; calls are written inside <% %> tags and return a string. The specific calling rules are shown in Table 3-1 below:

Option          | Description
----------------|------------------------------------------------------------------
Call location   | The substitution API can be inserted at any location in a KM.
Invocation mode | Called inside <% %> tags; the substitution API method call is written between the tags.
Call syntax     | In ODI 10.1.3.2.0, substitution API methods are called with the snpRef or odiRef prefix; for example, to invoke the getFrom() substitution method, write odiRef.getFrom().
Call time       | The Oracle Data Integrator API is implemented by the SnpReference class, and its snpRef instance is available at any time while the ODI platform is running.
API methods     | The substitution API contains many methods, such as global methods, journalizing methods, loading methods, checking methods, integration methods, reverse-engineering methods, web service methods, and action methods. For how to call a specific method, see the "ODI Substitution Methods Reference", which describes the use and parameters of each method in detail.

Table 3-1: Development rules

The generic syntax is as follows:

free text <%=Java expression%>

Free text: any text in the code block.

Java expression: a Java expression that evaluates to a string.

Example of a Java expression: snpRef.getTableName("WORK_TABLE") + "FUTURE"

Special syntax for the CKM:

The following syntax is used in an IKM to invoke the check procedure (CKM):

<% @ INCLUDE (CKM_FLOW | CKM_STATIC) [DELETE_ERRORS] %>

CKM_FLOW: triggers flow control, according to the CKM selected in the control settings of the interface.

CKM_STATIC: triggers static control of the target data store. The constraints defined for the data store and selected as static constraints are checked.

DELETE_ERRORS: automatically deletes the detected errors.

Using substitution methods in actions:

<% odiRef.method_name(); %>

For example: <% odiRef.dropReferringFKs(); %>

Note: do not use "=", and end the call with a semicolon.

The available actions are generally: add, drop, enable, disable, and modify.

3.2.3 Guidelines

Following the guidelines below when modifying or developing a KM makes the development process easier and more flexible. They summarize our experience:

• Do not develop a KM from scratch; the cost is too high. ODI already provides more than 100 KMs, so study the existing KMs before writing your own: the more KMs you are familiar with, the faster development goes. For example, you can copy an existing KM and improve or extend it, or copy a similar piece of code into a newly developed KM for reuse.

• Hard-code as little as possible in a KM. Avoid writing physical table names, schema names, and catalogs into the KM; use API methods such as getTable() and getObjectName() instead, to make the KM more general.

• Prefer SQL statements in a KM rather than writing everything in Java or Jython, because SQL statements are easier to read and maintain.

• In addition, give a newly developed KM a name that makes its purpose easy to understand.

3.3 Development platform

3.3.1 ODI development

KMs can be developed directly on the ODI platform. The steps are as follows:

1. In ODI, open Designer → Projects → Knowledge Modules.

2. Select the KM to edit, for example LKM File to DB2 UDB (LOAD), and double-click it to open it.

3. Click Details and select the step you need to edit; you can also add or remove steps.

4. Double-click a specific step to open the code editor and write the KM code.

5. A finished KM can only be run by calling it from an interface; check the executed code in Operator to verify that it is correct.

3.3.2 Eclipse Development

Writing a KM directly inside Designer, having an interface invoke it, and then viewing the execution results in Operator is inefficient and makes it hard to locate code errors. It is therefore worth choosing a development platform on which the KM can be developed, compiled, and debugged directly.

The open source plugin JyDT supports developing, debugging, and compiling Jython code on the Eclipse platform. Developing a KM in Eclipse and then porting it to ODI can greatly improve development efficiency, but note the following issues:

• Eclipse with JyDT supports only Jython development. For a KM that mixes in substitution API calls, Eclipse cannot directly retrieve the information stored in the repository.

• A workaround is to run the KM in an interface first, take the generated KM code from Operator, and port that code to Eclipse for modification. At that point the substitution API methods have already been resolved into the actual physical configuration information.

• KMs assembled mostly from SQL statements cannot be edited in Eclipse.

This shows that Eclipse is suitable for KMs written directly in Jython. KM development can therefore combine the two platforms, choosing between them according to the actual situation to improve development efficiency.

== End of document ==

This article is reposted from: http://dangdj.spaces.live.com/blog/cns!EDE097CEA39ABC75!186.entry
