SSIS data flow component development (1)

Microsoft calls its data flow technology pipeline technology, so data flow components are also called pipeline components. You can picture the data as water flowing through a pipe: each data flow component receives the data delivered by its upstream component, processes it, and then delivers it to its downstream component.

Component metadata

You can right-click a component and select "Show Advanced Editor" to view the component's metadata.

 

Not every component offers the advanced editor, and for many components the predefined metadata cannot be modified freely even in the advanced editor; for example, you may not be able to add an output to a component or delete one from it. The methods for protecting metadata are described in the part on implementing component design.

Inputs and outputs are the most basic metadata of a component. Each component can have several inputs and outputs. An input receives the data output by the upstream component; after the component finishes processing, an output passes the data on to the downstream components. A source component has only outputs, and a destination component has only inputs.

The output of the upstream component is connected to the input of the downstream component through a path. The green directed line in the figure is the path. You can add a data viewer to the path to view the data transfer process during execution.
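The relationship between components, inputs, outputs, and paths can be sketched as a small data model. This is only an illustration in Python, not the SSIS object model: the class names `Component` and `Path`, the helper `validate_path`, and the component and port names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Component:
    # Hypothetical stand-in for a data flow component's metadata.
    name: str
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)


@dataclass
class Path:
    # A path joins one upstream output to one downstream input.
    source: Tuple[str, str]       # (component name, output name)
    destination: Tuple[str, str]  # (component name, input name)


def validate_path(path: Path, components: Dict[str, Component]) -> bool:
    """Return True if the path starts at an existing output and ends at an existing input."""
    src_comp, src_out = path.source
    dst_comp, dst_in = path.destination
    return (
        src_comp in components
        and dst_comp in components
        and src_out in components[src_comp].outputs
        and dst_in in components[dst_comp].inputs
    )


components = {
    # A source has only outputs, a destination has only inputs.
    "Source": Component("Source", outputs=["Source Output"]),
    "Derived Column": Component(
        "Derived Column",
        inputs=["Derived Column Input"],
        outputs=["Derived Column Output"],
    ),
    "Destination": Component("Destination", inputs=["Destination Input"]),
}

paths = [
    Path(("Source", "Source Output"), ("Derived Column", "Derived Column Input")),
    Path(("Derived Column", "Derived Column Output"), ("Destination", "Destination Input")),
]

all_valid = all(validate_path(p, components) for p in paths)
```

A path that tries to start from the destination fails validation, because the destination in this sketch has no outputs.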

 

Another important piece of metadata is the connection manager. Source and destination components need a connection manager to connect to external data; through it, a component can read data from or write data to external databases.

Transformation types

A component can handle the data passing through it in one of three ways:

1. Row transformation: the component receives a row, processes it, and then outputs it. No rows are created or deleted while the component processes the data. For example, if a component receives 100 rows, each row is handed to the downstream component as soon as it has been processed, until all 100 rows have been handed over.

2. Semi-blocking transformation: the component holds data for a period of time. For example, a component receives 100 rows and summarizes every 10 rows, so its final output is 10 summary rows. Unlike a row transformation, it does not output a row immediately after receiving it. It stores the rows temporarily and produces an output row only after 10 rows have been processed.

3. Fully blocking transformation: similar to a semi-blocking transformation, except that the component holds all of the data. When processing 100 rows, the semi-blocking component produces output after every 10 rows, whereas a fully blocking component produces output only after all 100 rows have been received. The Sort component is an example: the sort order can be determined only after all rows have been received.
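The difference between the three types can be sketched with Python generators. This is a minimal illustration of the behavior described above, not SSIS code: the function names are hypothetical, rows are plain integers, "processing" is doubling a value, and "summarizing" is summing each group of 10.

```python
from typing import Iterable, Iterator, List


def row_transform(rows: Iterable[int]) -> Iterator[int]:
    # Row transformation: one row in, one row out, emitted immediately.
    for row in rows:
        yield row * 2


def semi_blocking_transform(rows: Iterable[int], batch_size: int = 10) -> Iterator[int]:
    # Semi-blocking: hold rows until a batch is complete, then emit one summary row.
    batch: List[int] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield sum(batch)
            batch = []
    if batch:
        # Emit a summary for a final, incomplete batch.
        yield sum(batch)


def fully_blocking_transform(rows: Iterable[int]) -> Iterator[int]:
    # Fully blocking: sorted() must consume every row before the first row is emitted.
    yield from sorted(rows)


source = list(range(100, 0, -1))                    # 100 rows: 100, 99, ..., 1
doubled = list(row_transform(source))               # 100 rows out
summaries = list(semi_blocking_transform(source))   # 10 summary rows out
ordered = list(fully_blocking_transform(source))    # 100 rows out, in ascending order
```

With 100 input rows, the row transformation yields 100 rows, the semi-blocking one yields 10 summaries, and the fully blocking one yields nothing until it has read all 100 rows.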

Synchronous/asynchronous output

Corresponding to the transformation types, component outputs are divided into two kinds: synchronous and asynchronous.

The output of a row transformation component is synchronous. Open the advanced editor for a Derived Column component. The IdentificationString of its input is: input "Derived Column Input" (325).

Now look at the output: its SynchronousInputID is the IdentificationString of that input. This indicates that the output is synchronous with the input: each time the input receives a row, the output passes a row to the downstream component.

The outputs of semi-blocking and fully blocking components are asynchronous; their SynchronousInputID is empty.
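The rule for telling the two kinds of output apart can be written down in a few lines. The sketch below is a simplified Python model, not the SSIS API: the `Output` class, its `synchronous_input_id` field, and the `is_synchronous` helper are hypothetical names, and an empty SynchronousInputID is represented here as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Output:
    # Hypothetical model of an output's metadata.
    name: str
    # ID of the input this output is synchronous with; None models an empty value.
    synchronous_input_id: Optional[int] = None


def is_synchronous(output: Output) -> bool:
    """An output is synchronous when it references an input; otherwise it is asynchronous."""
    return output.synchronous_input_id is not None


# The Derived Column output points at the input with ID 325 from the example above.
derived_output = Output("Derived Column Output", synchronous_input_id=325)
# A Sort output references no input, so it is asynchronous.
sort_output = Output("Sort Output")

kinds = {
    o.name: ("synchronous" if is_synchronous(o) else "asynchronous")
    for o in (derived_output, sort_output)
}
```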

Buffer

A buffer is the container that holds data. After a source component obtains data from an external data source, the data is loaded into a buffer and kept in memory. A data flow can contain multiple buffers. A buffer is not exclusive to a single component but is shared by a group of components, and the number of buffers is determined by the buffer capacity.

 

Each asynchronous output in the data flow creates a new buffer. Of the 4 components, the source component and the Sort component have asynchronous outputs, so there are two buffers. Buffer1 contains the columns of the source component and the Derived Column component. Assume that the source component reads 3 columns from the external data source and the Derived Column component derives 2 columns; Buffer1 then defines 5 columns. After the Sort component has received the data in Buffer1, the lifecycle of Buffer1 ends and the Sort component creates Buffer2.
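The way asynchronous outputs divide a data flow into buffers can be sketched as follows. This is a simplified Python model rather than the engine's real algorithm. The `Stage` class and `plan_buffers` function are hypothetical, the four components are assumed to be a source, a Derived Column, a Sort, and a destination, and the Sort output is assumed to carry all 5 columns.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Stage:
    # Hypothetical description of one component in a linear data flow.
    name: str
    new_columns: int           # columns this component defines on its output
    asynchronous_output: bool  # True if the component has an asynchronous output


def plan_buffers(stages: List[Stage]) -> List[Dict]:
    """Start a new buffer at every asynchronous output; other components share the current one."""
    buffers: List[Dict] = []
    for stage in stages:
        if stage.asynchronous_output:
            buffers.append(
                {"created_by": stage.name, "components": [stage.name], "columns": stage.new_columns}
            )
        elif buffers:
            # A synchronous component reuses the current buffer and adds its own columns to it.
            buffers[-1]["components"].append(stage.name)
            buffers[-1]["columns"] += stage.new_columns
    return buffers


stages = [
    Stage("Source", new_columns=3, asynchronous_output=True),
    Stage("Derived Column", new_columns=2, asynchronous_output=False),
    # Assumption: the Sort output copies all 5 columns into its new buffer.
    Stage("Sort", new_columns=5, asynchronous_output=True),
    Stage("Destination", new_columns=0, asynchronous_output=False),
]

buffers = plan_buffers(stages)
```

Under these assumptions the plan contains two buffers: the first is created by the source and defines 3 + 2 = 5 columns, and the second is created by the Sort component.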

Buffer capacity is limited. You can set the DefaultBufferSize property of the data flow task to specify the buffer size; the maximum is 100 MB. DefaultBufferMaxRows is the maximum number of rows in a buffer. The size is also adjusted according to MinBufferSize and MaxBufferSize.

If the data a component needs exceeds the size of one buffer, multiple buffers are used.
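As a rough illustration of how the size limit and the row limit interact, the sketch below estimates how many rows fit in one buffer and how many buffers a given row count needs. It is a deliberate simplification, not the engine's actual sizing logic: the function names are hypothetical, the estimated row size is an input you supply, and the defaults of 10 MB and 10,000 rows are assumed values that you should check against your own package settings.

```python
import math


def rows_per_buffer(
    row_size_bytes: int,
    default_buffer_size: int = 10 * 1024 * 1024,  # assumed DefaultBufferSize, in bytes
    default_buffer_max_rows: int = 10_000,        # assumed DefaultBufferMaxRows
) -> int:
    """Rows in one buffer: limited by both the byte size and the row count setting."""
    if row_size_bytes <= 0:
        raise ValueError("row_size_bytes must be positive")
    rows_that_fit = default_buffer_size // row_size_bytes
    # At least one row per buffer, never more than the row limit.
    return max(1, min(default_buffer_max_rows, rows_that_fit))


def buffers_needed(total_rows: int, row_size_bytes: int, **settings: int) -> int:
    """Number of buffers required to carry total_rows through the data flow."""
    return math.ceil(total_rows / rows_per_buffer(row_size_bytes, **settings))


narrow = rows_per_buffer(100)            # row limit applies: 10,000 rows
wide = rows_per_buffer(2000)             # size limit applies: 10,485,760 // 2000 = 5,242 rows
narrow_buffers = buffers_needed(100_000, 100)
wide_buffers = buffers_needed(100_000, 2000)
```

For narrow rows the row limit is the binding constraint, while for wide rows the byte size is, so the same 100,000 rows need more buffers when each row is larger.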

 

With these concepts in hand, you can start developing data flow components. To learn more about the data flow execution plan, see http://technet.microsoft.com/en-us/library/ms136012.aspx
