PB Data Pipeline

Source: Internet
Author: User

Data Pipeline provides a method for transferring data and/or table structures between different databases.


Data Pipeline object
To define a data pipeline, you must provide the following:
The source and target databases, both of which must be reachable with a normal connection;
The tables in the source database to read from;
The table in the target database to copy the data to;
The pipeline operation to perform;
The commit frequency;
The maximum number of errors allowed;
Whether to include extended attributes.


Data Pipeline painter
New -> Database -> Data Pipeline
Data Pipeline options:
Table: name of the target table.
Key: name of the primary key of the target table (the name of the key itself, not of a key column).
Options: the pipeline operation, one of:
  Create: add table
  Replace: drop and add table
  Refresh: delete and insert rows
  Append: insert rows
  Update: update or insert rows
Max Errors: maximum number of errors allowed.
Commit: number of rows written per committed transaction.
Extended Attributes: whether the pipeline also copies extended attributes.

Design -> Database Blob is used to process BLOB columns.

If you want to create only the table in the target database without copying any data, define a retrieval condition that is never true, for example 2 < 1.


Data Pipeline user object
New -> PB Object -> Standard Class -> OK -> select pipeline -> OK


Attribute:
DataObject: specifies which data pipeline object to use
This is the most important attribute of a pipeline user object. It is similar to the DataObject attribute of a DataWindow control. It can only be set at runtime; although the attribute also appears in the user object painter, setting it there does not seem to have any effect. The following example sets DataObject for a pipeline user object:
iuo_pipeline.DataObject = "p_copy_employee"
If the DataObject attribute of a DataWindow control is changed dynamically in a script, the corresponding DataWindow object is not included in the executable when the application is built. The same applies to data pipeline objects: a pipeline object cannot be included in the executable, and it cannot be listed in a PBR resource file either. It can only be deployed in a PBD or DLL file.

RowsInError: number of rows in error
Type Long. The number of errors that have occurred during pipeline execution. This value never exceeds the Max Errors value defined in the pipeline painter.

RowsRead: number of rows read
Type Long. The number of rows that have been read so far during pipeline execution.

RowsWritten: number of rows successfully written
Type Long. The number of rows that have been written to the target database.

Syntax: the syntax of the pipeline object
This is also a very important attribute. It holds the complete definition of the pipeline, so to some extent it can stand in for the pipeline object itself. It makes it possible to build a general-purpose data pipeline, because string functions such as Mid, Pos, Len, Left, and Right can be used to modify the pipeline syntax at runtime to suit the needs of the program.
To view the syntax, select the pipeline object, right-click it, and choose Edit Source.
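
As a rough sketch of the string handling described above, the following script changes one setting inside the pipeline syntax. The variable iuo_pipeline is assumed to be a pipeline user object whose DataObject has already been set, and the literal text "commit=100" is only an assumption about what the syntax contains; check the real text with Edit Source first.

```powerscript
// Sketch only: replaces one fragment of the pipeline syntax.
// ASSUMPTION: the syntax contains the literal text "commit=100".
string ls_syntax, ls_old, ls_new
long ll_pos

ls_old = "commit=100"
ls_new = "commit=500"

ls_syntax = iuo_pipeline.Syntax
ll_pos = Pos(ls_syntax, ls_old)

IF ll_pos > 0 THEN
	// Keep the text before the match, add the new fragment,
	// then append everything after the match.
	ls_syntax = Left(ls_syntax, ll_pos - 1) + ls_new + &
		Mid(ls_syntax, ll_pos + Len(ls_old))
	iuo_pipeline.Syntax = ls_syntax
END IF
```

The same pattern could be applied to other parts of the definition, such as the target table name, as long as the fragment being searched for is unique within the syntax.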


Event:
In addition to the two standard events constructor and destructor, the pipeline user object has three events of its own: pipestart, pipemeter, and pipeend. They are triggered when the pipeline starts, while it is running, and after it finishes. These three events, together with some attributes of the pipeline, are usually used to show the progress of the run.

The pipestart event is triggered after the pipeline function Start() or Repair() is called. The pipemeter event is triggered after each transaction is committed, which normally happens when the number of rows processed reaches the Commit value. The pipeend event is triggered when the Start or Repair function finishes.
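
A minimal sketch of progress reporting in the pipemeter event might look like the following. The instance variable ist_status (of type statictext) is hypothetical; the calling window would have to assign it before calling Start().

```powerscript
// Script for the pipemeter event of the pipeline user object.
// ASSUMPTION: ist_status is a statictext instance variable of this
// user object, set by the window that runs the pipeline.
IF IsValid(ist_status) THEN
	ist_status.Text = "Read: " + String(RowsRead) + &
		"  Written: " + String(RowsWritten) + &
		"  Errors: " + String(RowsInError)
END IF
```

Similar scripts in pipestart and pipeend could show a "started" and "finished" message in the same control.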


Function:
Start:
pipelineobject.Start(sourcetrans, destinationtrans, errordatawindow {, arg1, arg2, ..., argn})
pipelineobject is a variable of the pipeline user object type; its DataObject attribute must have a valid value before the function is called. sourcetrans is the transaction object connected to the source database, and destinationtrans is the transaction object connected to the target database. Both transaction objects must already be connected to their databases. If an error occurs while the pipeline is running, the error information is displayed in the errordatawindow DataWindow control. There is no need to assign a DataWindow object to this control; even if one is assigned, it is replaced when error information is displayed. These three arguments are required; without them the function cannot work.
The remaining arguments depend on whether the data pipeline defines retrieval arguments. If it does, arg1, arg2, ..., argn must be supplied, matching the retrieval arguments in number and type. If the pipeline object defines retrieval arguments but the Start function does not supply values for them, a dialog box appears when the pipeline runs and asks the user to enter them. The function has many possible return values, listed below:
Return value   Meaning
  1    The function ran successfully
 -1    Failed to open the pipeline
 -2    Too many columns
 -3    The target table already exists
 -4    The target table does not exist
 -5    Connection error
 -6    Wrong retrieval arguments
 -7    Column mismatch
 -8    Fatal SQL error in the source
 -9    Fatal SQL error in the target
-10    Maximum number of errors exceeded
-12    Table syntax error
-13    The required primary key was not provided
-15    A pipeline operation is already in progress
-16    Error in the source database
-17    Error in the target database
-18    The target database is read-only
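
The call and its return value handling can be sketched as follows. The names iuo_pipeline, itrans_source, itrans_target, and dw_errors are assumptions for illustration: instance variables for the pipeline user object and the two connected transaction objects, and a DataWindow control on the window.

```powerscript
// Sketch only: start the pipeline and react to a few return codes.
// ASSUMPTION: both transaction objects are already connected and
// dw_errors is a DataWindow control on the current window.
integer li_rc

iuo_pipeline.DataObject = "p_copy_employee"
li_rc = iuo_pipeline.Start(itrans_source, itrans_target, dw_errors)

CHOOSE CASE li_rc
	CASE 1
		MessageBox("Pipeline", "Rows written: " + &
			String(iuo_pipeline.RowsWritten))
	CASE -5
		MessageBox("Pipeline", "A database connection is missing.")
	CASE -10
		MessageBox("Pipeline", "Maximum number of errors exceeded.")
	CASE ELSE
		MessageBox("Pipeline", "Start failed, return code " + String(li_rc))
END CHOOSE
```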


Repair:
When a pipeline is run from a script, any errors that occur must be handled. The Repair function is meant to be called after an error has occurred in the pipeline.
After an error occurs, the error message and the rows in error are displayed in the DataWindow control passed as the errordatawindow argument of the Start function. The user can correct the data there and then call the Repair function to submit it again. This is equivalent to clicking the Update DB button on the PainterBar after an error occurs in the Data Pipeline painter, or choosing the menu item Design -> Update Database.

pipelineobject.Repair(destinationtrans)
pipelineobject is a variable of the pipeline user object type; its DataObject attribute must have a valid value before the function is called. destinationtrans is the transaction object connected to the target database. This function also has many possible return values, listed below:
Return value   Meaning
  1    The function ran successfully
 -5    Connection error
 -9    Fatal SQL error in the target
-10    Maximum number of errors exceeded
-11    Invalid window handle
-12    Table syntax error
-15    A pipeline operation is already in progress
-17    Error in the target database
-18    The target database is read-only
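
A repair step could be sketched like this, for example in the clicked event of a button the user presses after correcting the rows in the error DataWindow. The names iuo_pipeline and itrans_target are assumed instance variables that were also used for the earlier Start call.

```powerscript
// Sketch only: resubmit the corrected rows to the target database.
// ASSUMPTION: iuo_pipeline and itrans_target are the same objects
// that were used in the earlier call to Start().
integer li_rc

li_rc = iuo_pipeline.Repair(itrans_target)

IF li_rc = 1 THEN
	MessageBox("Pipeline", "Corrected rows were written.")
ELSE
	MessageBox("Pipeline", "Repair failed, return code " + String(li_rc))
END IF
```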


Cancel:
This function stops a running pipeline. Call it when you want to force a data pipeline to stop.

pipelineobject.Cancel()
The function returns 1 if it succeeds and -1 otherwise.
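
For illustration, a Cancel call might be placed in the clicked event of a hypothetical Cancel button; iuo_pipeline is an assumed instance variable holding the running pipeline.

```powerscript
// Script for the clicked event of a hypothetical Cancel button.
// ASSUMPTION: iuo_pipeline is the pipeline that is currently running.
IF iuo_pipeline.Cancel() = 1 THEN
	MessageBox("Pipeline", "The pipeline was cancelled.")
ELSE
	MessageBox("Pipeline", "The pipeline could not be cancelled.")
END IF
```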


Execute Data Pipeline

Directly execute
Use the menu item Design -> Execute, or the Execute button on the PainterBar, to run the current data pipeline. If an error message appears while the pipeline is running, modify the pipeline definition according to the message and run it again.
If retrieval arguments are defined for extracting the source data, you are asked to enter their values at runtime, and those values are used to retrieve the matching rows from the source table. If no retrieval arguments are defined, data is read directly from the source table.

Run pipelines in programs
Take the following steps:
Create the related objects
Connect to the source and target databases
Create the data pipeline user object and set its properties
Run the pipeline and handle any errors

To use a data pipeline in a script, three objects are involved: the data pipeline object, the data pipeline user object, and a DataWindow object that holds the error information. The DataWindow object is created automatically by the system when an error occurs during pipeline execution. The data pipeline object is created in the Data Pipeline painter, and the data pipeline user object is created as a standard class user object.
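
The steps above can be sketched in one script. Every name here is an assumption made for illustration: u_pipeline (a standard class user object inherited from pipeline), p_copy_employee (the pipeline object), dw_errors (a DataWindow control on the window), and the two ODBC data source names.

```powerscript
// Sketch only; all object and data source names are hypothetical.
transaction ltr_source, ltr_target
u_pipeline luo_pipeline
integer li_rc

// 1. Create the related objects
ltr_source = CREATE transaction
ltr_target = CREATE transaction
luo_pipeline = CREATE u_pipeline

// 2. Connect to the source and target databases
ltr_source.DBMS = "ODBC"
ltr_source.DBParm = "ConnectString='DSN=source_db'"
CONNECT USING ltr_source;

ltr_target.DBMS = "ODBC"
ltr_target.DBParm = "ConnectString='DSN=target_db'"
CONNECT USING ltr_target;

IF ltr_source.SQLCode <> 0 OR ltr_target.SQLCode <> 0 THEN
	MessageBox("Pipeline", "Could not connect to both databases.")
ELSE
	// 3. Choose the pipeline object, then 4. run it
	luo_pipeline.DataObject = "p_copy_employee"
	li_rc = luo_pipeline.Start(ltr_source, ltr_target, dw_errors)
	IF li_rc <> 1 THEN
		MessageBox("Pipeline", "Start failed, return code " + String(li_rc))
	END IF
END IF

// Clean up
IF ltr_source.SQLCode = 0 THEN DISCONNECT USING ltr_source;
IF ltr_target.SQLCode = 0 THEN DISCONNECT USING ltr_target;
DESTROY luo_pipeline
DESTROY ltr_source
DESTROY ltr_target
```

This sketch uses local variables and cleans up immediately. If the user should be able to correct errors and call Repair afterwards, the pipeline and the target transaction would need to be kept alive, for example as instance variables of the window.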

 
