[Knowledge sharing] Typical SSIS applications

Source: Internet
Author: User
Tags ssis

Integration Services provides a rich set of built-in tasks, containers, transformations, and data adapters that support the development of business applications. Without writing a single line of code, you can create SSIS solutions that use ETL and business intelligence to solve complex business problems, manage SQL Server databases, and copy SQL Server objects between instances of SQL Server.

The following sections describe typical uses of SSIS packages.

Merge data from heterogeneous data stores

Data is typically stored in many different data storage systems, and extracting data from all sources and merging it into a single, consistent dataset is challenging. This situation can occur for several reasons. For example:

    • Many organizations archive information that is stored in legacy data storage systems. This data may not be important to daily operations, but it can be valuable for trend analysis that requires data collected over a long period of time.
    • Different departments of an organization may use different data storage technologies to store operational data. A package may need to extract data from spreadsheets as well as relational databases before it can merge the data.
    • Data may be stored in databases that use different schemas for the same data. A package may need to change the data type of a column or combine data from multiple columns into one column before it can merge the data.

Integration Services can connect to a wide variety of data sources, including multiple sources in a single package. A package can connect to relational databases by using .NET and OLE DB providers, and to many legacy databases by using ODBC drivers. It can also connect to flat files, Excel files, and Analysis Services projects.

Integration Services includes source components that extract data from flat files, Excel spreadsheets, XML documents, and tables and views in the relational databases to which the package connects.

Next, the data is typically transformed by using the transformations that Integration Services includes. After the data is transformed to compatible formats, it can be merged physically into one dataset.

After the data is merged successfully and the transformations are applied, it is usually loaded into one or more destinations. Integration Services includes destinations for loading data into flat files, raw files, and relational databases. The data can also be loaded into an in-memory recordset and accessed by other package elements.
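SSIS packages are typically assembled in a graphical designer rather than written as code, so the following is only a Python analogy of the extract, transform, merge, and load flow described above. The flat-file contents, the table names `orders` and `sales`, and the column names are all hypothetical and chosen for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file source: the name is split across two columns
# and the amount arrives as text.
FLAT_FILE_TEXT = (
    "first_name,last_name,amount\n"
    "Ana,Silva,10.50\n"
    "Ben,Okoro,7\n"
)


def extract_flat_file(handle):
    """Read flat-file rows and convert them to the common shape."""
    for row in csv.DictReader(handle):
        yield {
            "customer": f"{row['first_name']} {row['last_name']}",  # combine two columns
            "amount": float(row["amount"]),  # change the data type: text -> number
        }


def extract_database(conn):
    """Read rows from a relational source that uses a different schema."""
    query = "SELECT customer, amount_cents FROM orders"
    for customer, amount_cents in conn.execute(query):
        yield {"customer": customer, "amount": amount_cents / 100}


def merge_and_load():
    # Hypothetical relational source with its own schema.
    source_db = sqlite3.connect(":memory:")
    source_db.execute("CREATE TABLE orders (customer TEXT, amount_cents INTEGER)")
    source_db.execute("INSERT INTO orders VALUES ('Chen Wei', 1999)")

    # Merge both sources into one dataset that shares a single shape.
    merged = list(extract_flat_file(io.StringIO(FLAT_FILE_TEXT)))
    merged += list(extract_database(source_db))

    # Load the merged dataset into a single destination table.
    target = sqlite3.connect(":memory:")
    target.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
    target.executemany("INSERT INTO sales VALUES (:customer, :amount)", merged)
    return target.execute(
        "SELECT customer, amount FROM sales ORDER BY customer"
    ).fetchall()
```

Calling `merge_and_load()` returns the rows of the destination table, with both sources converted to the same two columns before loading.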

Populate data warehouses and data marts

Data in data warehouses and data marts is usually updated frequently, and the data loads are typically very large.

Integration Services includes a task that bulk loads data directly from a flat file into SQL Server tables and views, and a destination component that bulk loads data into a SQL Server database as the last step in a data transformation process.

An SSIS package can be configured to be restartable. This means that you can rerun the package from a predetermined checkpoint, which is either a task or a container in the package. The ability to restart a package can save a lot of time, especially if the package processes data from a large number of sources.
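The restart idea can be illustrated with a small Python analogy (this is not SSIS code). The sketch assumes a hypothetical JSON checkpoint file that lists the steps already completed; the function name `run_with_checkpoint` and the step names are invented for the example.

```python
import json
import os


def run_with_checkpoint(steps, checkpoint_path):
    """Run (name, action) steps in order, skipping steps the checkpoint marks as done."""
    done = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as handle:
            done = json.load(handle)

    executed = []
    for name, action in steps:
        if name in done:
            continue  # completed in an earlier run, so do not repeat it
        action()  # may raise; the checkpoint file then keeps the progress so far
        done.append(name)
        executed.append(name)
        with open(checkpoint_path, "w") as handle:
            json.dump(done, handle)

    # The whole run succeeded, so remove the checkpoint and start fresh next time.
    if os.path.exists(checkpoint_path):
        os.remove(checkpoint_path)
    return executed
```

If the second of three steps fails on the first run, a second call with the same checkpoint path skips the first step and resumes at the failed one, which is where the time saving comes from.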

You can use SSIS packages to load the dimension and fact tables in the database. If the source data for a dimension table is stored in multiple data sources, the package can merge the data into one dataset and load the dimension table in a single process, instead of using a separate process for each data source.

Updating data in data warehouses and data marts can be complex, because both types of data stores typically include slowly changing dimensions that can be difficult to manage through a data transformation process. The Slowly Changing Dimension Wizard automates support for slowly changing dimensions by dynamically creating the SQL statements that insert and update records, update related records, and add new columns to tables.
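To make the insert-and-update pattern concrete, here is a minimal Python sketch of one common way to handle a changing dimension attribute: keep history by closing the current row and inserting a new one. It is an illustration under stated assumptions, not the SQL that the wizard generates. The table `dim_customer` and its columns are hypothetical.

```python
import sqlite3


def apply_dimension_change(conn, customer_id, city, load_date):
    """Expire the current row if the tracked attribute changed, then insert a new current row."""
    current = conn.execute(
        "SELECT city FROM dim_customer WHERE customer_id = ? AND end_date IS NULL",
        (customer_id,),
    ).fetchone()

    if current is not None and current[0] == city:
        return "unchanged"  # nothing to insert or update

    if current is not None:
        # Close the existing version so the history is preserved.
        conn.execute(
            "UPDATE dim_customer SET end_date = ? "
            "WHERE customer_id = ? AND end_date IS NULL",
            (load_date, customer_id),
        )

    conn.execute(
        "INSERT INTO dim_customer (customer_id, city, start_date, end_date) "
        "VALUES (?, ?, ?, NULL)",
        (customer_id, city, load_date),
    )
    return "inserted" if current is None else "versioned"


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dim_customer "
        "(customer_id INTEGER, city TEXT, start_date TEXT, end_date TEXT)"
    )
    results = [
        apply_dimension_change(conn, 1, "Lisbon", "2024-01-01"),
        apply_dimension_change(conn, 1, "Lisbon", "2024-02-01"),
        apply_dimension_change(conn, 1, "Porto", "2024-03-01"),
    ]
    rows = conn.execute(
        "SELECT city, start_date, end_date FROM dim_customer ORDER BY start_date"
    ).fetchall()
    return results, rows
```

In `demo()`, the first load inserts the customer, the second finds no change, and the third closes the Lisbon row and adds a Porto row, so the table ends with two versions of the same customer.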

Additionally, tasks and transformations in Integration Services packages can process Analysis Services cubes and dimensions. When a package updates tables in the database that a cube is built on, you can use Integration Services tasks and transformations to automatically process the cube and its dimensions as well. Processing cubes and dimensions automatically helps keep the data current for both groups of users: those who access information in the cubes and dimensions, and those who access data in the relational database.

Integration Services can also compute functions before the data is loaded into its destination. If your data warehouses and data marts store aggregated information, the SSIS package can compute functions such as SUM, AVERAGE, and COUNT. An SSIS transformation can also pivot relational data and transform it into a less-normalized format that is more compatible with the table structures in the data warehouse.
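As a Python analogy of computing aggregates before loading (not SSIS code), the sketch below groups rows by a key and computes the sum, average, and count for each group. The function name `aggregate_sales` and the `(region, amount)` row shape are assumptions made for the example.

```python
from collections import defaultdict


def aggregate_sales(rows):
    """Group (region, amount) rows by region and compute sum, average, and count."""
    groups = defaultdict(list)
    for region, amount in rows:
        groups[region].append(amount)

    return {
        region: {
            "sum": sum(amounts),
            "average": sum(amounts) / len(amounts),
            "count": len(amounts),
        }
        for region, amounts in groups.items()
    }
```

Only the per-region summary would then be written to the destination, instead of every detail row.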

Clean and standardize data

Whether data is loaded into an online transaction processing (OLTP) database, an online analytical processing (OLAP) database, an Excel spreadsheet, or a file, it needs to be cleaned and standardized before it is loaded. Data may need to be updated for the following reasons:

    • Data is contributed by multiple departments of an organization, each using different conventions and standards. Before the data can be used, it may need to be formatted differently. For example, you may need to combine the first name and the last name into one column.
    • Data is rented or purchased. Before it can be used, the data may need to be standardized and cleaned to meet business standards. For example, an organization may want to verify that all records use the same set of state abbreviations or the same set of product names.
    • Data is locale-specific. For example, the data may use different date/time and numeric formats. If data from different locales is merged, it must be converted to one locale before it is loaded, to avoid corrupting the data.

Integration Services includes built-in transformations that you can add to packages to clean and standardize data, change the case of data, convert data to a different type or format, or create new column values based on expressions. For example, a package could concatenate the first name and last name columns into a single full name column, and then change the characters to uppercase.
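The full-name example can be sketched in Python as a derived column computed from an expression. This is an analogy rather than SSIS code, and the column names `first_name`, `last_name`, and `full_name` are assumed for illustration.

```python
def build_full_name(row):
    """Return a copy of the row with an uppercase full_name column added."""
    # Trim stray whitespace, concatenate the two columns, then change the case.
    full_name = f"{row['first_name'].strip()} {row['last_name'].strip()}"
    return {**row, "full_name": full_name.upper()}
```

The original columns are left in place, so downstream steps can still use either the separate names or the combined one.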

An Integration Services package can also clean data by replacing the values in columns with values from a reference table, using either an exact lookup or a fuzzy lookup to locate values in the reference table. Frequently, a package applies the exact lookup first and, if that lookup fails, applies the fuzzy lookup. For example, the package first attempts to look up a product name in the reference table by using the primary key value of the product. When this search fails to return the product name, the package attempts the search again, this time using fuzzy matching on the product name.
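The exact-then-fuzzy pattern can be sketched in Python as follows. This is only an analogy: it uses the standard library's `difflib` for approximate matching, which is not the algorithm SSIS uses, and the reference table `PRODUCTS_BY_KEY`, the function name, and the `cutoff` value are hypothetical.

```python
import difflib

# Hypothetical reference table: primary key -> canonical product name.
PRODUCTS_BY_KEY = {101: "Mountain Bike", 102: "Road Helmet"}


def lookup_product_name(product_key, raw_name, cutoff=0.8):
    """Try an exact lookup by key first, then fall back to a fuzzy match on the name."""
    # Step 1: exact lookup on the primary key.
    if product_key in PRODUCTS_BY_KEY:
        return PRODUCTS_BY_KEY[product_key]

    # Step 2: fuzzy lookup on the name, ignoring case.
    by_lower = {name.lower(): name for name in PRODUCTS_BY_KEY.values()}
    matches = difflib.get_close_matches(
        raw_name.lower(), list(by_lower), n=1, cutoff=cutoff
    )
    return by_lower[matches[0]] if matches else None
```

A row with a valid key is resolved immediately; a row with a missing key and a misspelled name such as "Mountan Bike" is resolved by the fallback; a row that matches nothing returns `None` so that it can be routed for review.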

Another transformation cleans data by grouping values in a dataset that are similar. This is useful for identifying records that may be duplicates and therefore should not be inserted into your database without further evaluation. For example, by comparing addresses in customer records, you may identify a number of duplicate customers.
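A rough Python analogy of grouping similar values is shown below. It normalizes addresses and then greedily groups records whose addresses are similar according to `difflib.SequenceMatcher`. This is a simplified stand-in, not the matching logic SSIS applies, and the function names and the `threshold` value are assumptions.

```python
import difflib


def normalize(address):
    """Lowercase the address, drop periods and commas, and collapse whitespace."""
    cleaned = address.lower().replace(".", "").replace(",", "")
    return " ".join(cleaned.split())


def group_similar(customers, threshold=0.85):
    """Greedily place each (name, address) record into the first group with a similar address."""
    groups = []
    for name, address in customers:
        key = normalize(address)
        for group in groups:
            similarity = difflib.SequenceMatcher(None, key, group["key"]).ratio()
            if similarity >= threshold:
                group["members"].append(name)
                break
        else:
            # No existing group is close enough, so start a new one.
            groups.append({"key": key, "members": [name]})
    return [group["members"] for group in groups]
```

Groups with more than one member are candidate duplicates that would need further evaluation before being inserted.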

Build business intelligence into the data transformation process

A data transformation process requires built-in logic to respond dynamically to the data it accesses and processes.

The data may need to be summarized, converted, and distributed based on data values. The process may even need to reject data, based on an assessment of column values.

To address this requirement, the logic in the SSIS package may need to perform the following types of tasks:

    • Merge data from multiple data sources.
    • Evaluate data and apply data conversions.
    • Split a dataset into multiple datasets based on data values.
    • Apply different aggregations to different subsets of a dataset.
    • Load subsets of the data into different or multiple destinations.

Integration Services provides containers, tasks, and transformations for building business intelligence into SSIS packages.

Containers support the repetition of workflows by enumerating across files or objects and by evaluating expressions. A package can evaluate data and repeat workflows based on the results. For example, if the date is in the current month, the package performs one set of tasks; if not, the package performs an alternative set of tasks.

Tasks that use input parameters can also build business intelligence into packages. For example, the value of an input parameter can filter the data that a task retrieves.

Transformations can evaluate expressions and then, based on the results, send rows in a dataset to different destinations. After the data is divided, the package can apply different transformations to each subset of the dataset. For example, an expression can evaluate a date column, add sales data for the appropriate period, and then store only the summary information.
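The split-then-summarize example can be sketched in Python as two small steps: route each row based on an expression over its date column, then total one subset by period. This is an analogy, not SSIS code; the function names, the `(sale_date, amount)` row shape, and the cutoff date are assumptions.

```python
from datetime import date


def split_by_period(rows, cutoff):
    """Send each (sale_date, amount) row to 'recent' or 'older' based on its date."""
    recent, older = [], []
    for sale_date, amount in rows:
        target = recent if sale_date >= cutoff else older
        target.append((sale_date, amount))
    return recent, older


def summarize_by_month(rows):
    """Add up the amounts per calendar month so only summary rows need to be stored."""
    totals = {}
    for sale_date, amount in rows:
        month = sale_date.strftime("%Y-%m")
        totals[month] = totals.get(month, 0) + amount
    return totals
```

For instance, rows dated before the cutoff could be reduced to monthly totals with `summarize_by_month`, while the recent rows are loaded in full to a different destination.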

It is also possible to send a dataset to multiple destinations and apply different sets of transformations to the same data. For example, one set of transformations can summarize the data, while another set expands the data by looking up values in reference tables and adding data from other sources.

Automate administrative functions and data loading

Administrators frequently want to automate administrative functions such as backing up and restoring databases, copying SQL Server databases and the objects they contain, copying SQL Server objects, and loading data. Integration Services packages can perform these functions.

Integration Services includes tasks that are specifically designed to copy SQL Server database objects such as tables, views, and stored procedures; to copy SQL Server objects such as databases, logins, and statistics; and to add, change, and delete SQL Server objects and data by using Transact-SQL statements.

Administration of an OLTP or OLAP database environment frequently includes the loading of data. Integration Services includes several tasks that facilitate the bulk loading of data. You can use a task to load data from text files directly into SQL Server tables and views, or you can use a destination component to load data into SQL Server tables and views after applying transformations to the column data.

An Integration Services package can run other packages. A data transformation solution that includes many administrative functions can be separated into multiple packages so that managing and reusing the packages is easier.

If you need to perform the same administrative functions on different servers, you can use packages. A package can use looping to enumerate across the servers and perform the same functions on multiple computers. To support administration of SQL Server, Integration Services provides an enumerator that iterates across SQL Server Management Objects (SMO) objects. For example, a package can use the SMO enumerator to perform the same administrative functions on every job in the Jobs collection of a SQL Server installation.

In addition, SSIS packages can be scheduled by using SQL Server Agent jobs.
