Notes for SSIS Creation


1. An Integration Services project offers many control flow items; by combining them with execution (precedence) relationships, you can build a complete workflow. The most commonly used items are the Script Task, Execute SQL Task, Data Flow Task, and Execute Process Task.

2. You can right-click the Control Flow design surface to add variables. To access package variables from a Script Task, the variable names must first be listed in the task's ReadOnlyVariables and ReadWriteVariables properties; then click the Design Script (Edit Script) button to edit the script.
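As a minimal sketch (the variable names User::FileName and User::RowCount are assumptions made for illustration), a Script Task that lists those names in ReadOnlyVariables and ReadWriteVariables can read and write them in its Main method; the SSIS 2005-style Dts object and Dts.Results enumeration are used here:

Public Sub Main()
    ' Read a variable listed under ReadOnlyVariables (name is hypothetical)
    Dim fileName As String = CStr(Dts.Variables("User::FileName").Value)
    ' Write a variable listed under ReadWriteVariables (name is hypothetical)
    Dts.Variables("User::RowCount").Value = 0
    ' Report success back to the control flow
    Dts.TaskResult = Dts.Results.Success
End Sub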

3. Data Flow Task item

In an Integration Services project, adding a Data Flow Task creates a new data flow. Just as the control flow handles the main workflow of the package, the data flow handles the data transformation work. Almost every component that operates on data belongs to the data flow. As data moves through each step of the data flow, it changes according to the transformations applied. As shown in Figure 1-7, a new column generated by the Derived Column transformation becomes available to subsequent transformations or to the destination.

This section briefly describes each source, destination, and transformation; Chapters 1 and 4 cover them in more detail.

1.6.1 Source

A source is the location of the data specified by the user; data is pulled from the source into the data flow (the data pump). A source usually points to a Connection Manager in SSIS. By pointing to a Connection Manager, the connection can be reused throughout the package, because it only needs to be modified in one place. Six sources are available in SSIS:

● OLE DB Source: connects to nearly any OLE DB data source, such as SQL Server, Access, Oracle, or DB2.

● Excel Source: receives data from Excel workbooks. This source also makes it easy to run SQL queries against the workbook to limit the range of data sent into the data flow.

● Flat File Source: connects to a delimited or fixed-width file.

● Raw File Source: reads a special file format created by the Raw File destination. Raw files usually hold data that is in transit between packages, and they are very fast to read.

● XML Source: retrieves data from an XML document.

● DataReader Source: uses an ADO.NET connection, similar to connecting to a database through the DataReader interface in your .NET Framework application code.

1.6.2 Destination

In a data flow, destinations receive data from sources and from transformations. The flexible architecture of SSIS can send data to nearly any OLE DB-compliant data source or to a flat file. Like sources, destinations are managed through Connection Managers. The following destination types are available in SSIS:

● Data Mining Model Training: trains an Analysis Services mining model by passing data from the data flow to the destination.

● DataReader Destination: exposes data to external processes, such as Reporting Services or your own .NET application. This destination uses the ADO.NET DataReader interface to accomplish this.

● Dimension Processing: loads and processes an Analysis Services dimension. It can perform a full or incremental update of the dimension.

● Excel Destination: outputs data from the data flow to an Excel worksheet.

● Flat File Destination: writes data to a delimited or fixed-width file.

● OLE DB Destination: outputs data to an OLE DB connection, such as SQL Server, Oracle, or Access.

● Partition Processing: performs an incremental or full update of an Analysis Services partition.

● Raw File Destination: outputs data that can later be consumed by the Raw File source. This is a very specialized format that is very fast to write.

● Recordset Destination: writes records to an ADO recordset.

● SQL Server Destination: the most efficient destination for writing data into SQL Server.

● SQL Server Mobile Destination: inserts data into a SQL Server Mobile database running on a Pocket PC.

1.6.3 Transformation

Transformations are the key components of the data flow, responsible for converting data into the desired format. For example, you may want to sort and aggregate data; two transformations accomplish that. The biggest benefit of transformations in SSIS is that all processing is done in memory, so the elaborate script programming required in SQL Server 2000 DTS is no longer necessary. Transformations are discussed in detail in Chapters 4 and 6. The complete list of transformations available in SSIS is as follows:

● Aggregate: aggregates data from a transformation or source.

● Audit: adds auditing information to the package, such as when the package was run and by whom.

● Character Map: converts character data, for example from lowercase to uppercase.

● Conditional Split: splits data based on specified conditions. For example, this transformation can be configured to send data down different paths when the State column equals Florida.

● Copy Column: adds a copy of a column to the transformation output. The copy can be transformed later while the original data is retained for auditing.

● Data Conversion: converts a column's data from one data type to another.

● Data Mining Query: runs a data mining query against Analysis Services.

● Derived Column: creates a new column calculated from an expression.

● Export Column: exports a column from the data flow to a file. For example, you can use this transformation to write a column containing an image to a file.

● Fuzzy Grouping: cleanses data by finding rows that are likely duplicates.

● Fuzzy Lookup: matches and standardizes data based on fuzzy logic. For example, this transformation can change the name Jon to John.

● Import Column: reads data from a file and adds it to the data flow.

● Lookup: looks up data to be used later in a transformation. For example, you can use this transformation to look up a city based on a zip code.

● Merge: merges two sorted data sets in the data flow into a single data set.

● Merge Join: merges two data sets into a single data set using a join.

● Multicast: sends a copy of the data to each path added to the workflow.

● OLE DB Command: executes an OLE DB command for each row in the data flow.

● Percentage Sampling: captures a sample of the data using a percentage of the total number of rows in the data flow.

● Pivot: pivots data on a column into a less relational (denormalized) format. Pivoting a table means the data can be sliced in multiple ways, much like in OLAP and Excel.

● Row Count: stores the number of rows in the data flow in a variable.

● Row Sampling: captures a sample of the data using a fixed number of rows from the data flow.

● Script Component: transforms data using a script. For example, you can use this transformation to apply custom business logic to the data flow.

● Slowly Changing Dimension: coordinates the conditional insert or update of data in a slowly changing dimension.

● Sort: sorts data in the data flow by a given column.

● Term Extraction: extracts nouns or adjectives from text data.

● Term Lookup: looks up terms extracted from text against values in a reference table.

● Union All: merges multiple data sets into a single data set.

● Unpivot: converts data from a denormalized format into a relational format.

1.6.4 Script Component

In a Script Component, you can flexibly modify the inputs and outputs, changing the output by modifying the column values of each row. Note that if you want to modify a package variable in a Script Component, you need to override the PostExecute method and update the variable there. You can also filter rows with DirectRowToOutput by setting the output's ExclusionGroup property to a value other than 0.

Public Overrides Sub FileInput_ProcessInputRow(ByVal Row As FileInputBuffer)
    ' Skip header ("HDR") and trailer ("LR") records
    If Row.Col001 <> "HDR" And Row.Col001 <> "LR" Then
        Row.CrtTime = Now
        Row.FileName = Me.Variables.FileName
        If Row.Col002 = "y" Or Row.Col014 = "N" Then
            Row.ChequeFrm = "0"
        Else
            Row.ChequeFrm = "1"
        End If
        Row.PostStatus = "P"
        Row.SplitTktFlag = Row.Col001
        ' Send the row to the output explicitly (the output's ExclusionGroup is not 0)
        Row.DirectRowToOutput()
        ' Value is a class-level field; the ReadWrite variable itself is updated in PostExecute
        Value = "ABCD"
    End If
End Sub

' Override the PostExecute method to update the ReadWrite variable
Public Overrides Sub PostExecute()
    Variables.CLR = Value
    MyBase.PostExecute()
End Sub

1.7 Connection Manager

After a Connection Manager is created, you can modify its connection string from a script to re-point the connection. Pay attention to the connection string format, which differs between the native SQL client and OLE DB connection types.
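As a minimal sketch of this technique (the connection manager name MyOleDbConn and the server and database names are assumptions made for illustration), a Script Task can re-point an OLE DB connection at run time like this:

Public Sub Main()
    ' "MyOleDbConn", "MyServer", and "MyDb" are hypothetical names for this sketch
    Dts.Connections("MyOleDbConn").ConnectionString = _
        "Provider=SQLNCLI;Data Source=MyServer;Initial Catalog=MyDb;Integrated Security=SSPI;"
    Dts.TaskResult = Dts.Results.Success
End Sub

An ADO.NET (SqlClient) connection would instead take a connection string such as "Data Source=MyServer;Initial Catalog=MyDb;Integrated Security=True;", without the Provider keyword.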

