Integration Services, Part 5: Incremental Load - Deleting Rows

Source: Internet
Author: User
Tags: OLE, SSIS

This article is the fifth chapter of the Integration Services series; please refer to the original text for full details.

In the previous article you learned how to transfer updates from a source to a destination, and how to use set-based updates to optimize that operation.
Reviewing Incremental Load
Recall that there are three use cases in SSIS incremental loading:
1. New rows - add rows to the destination that have been added to the source since the previous load.
2. Updated rows - update rows in the destination that have been updated in the source since the previous load.
3. Deleted rows - remove rows from the destination that have been deleted from the source.
In this article we will look at the deleted-rows case of SSIS incremental loading: rows that have been removed from the source since they were loaded into the destination.
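The three use cases above map naturally onto set-based T-SQL. A minimal sketch, assuming the dbo.Person target and Person.Person source tables used throughout this series (BusinessEntityID is Person.Person's key column; the deleted-rows case is the subject of this article):

```sql
-- New rows: present in the source, absent from the destination
INSERT INTO dbo.Person (EntityID, FirstName, MiddleName, LastName)
SELECT src.BusinessEntityID, src.FirstName, src.MiddleName, src.LastName
FROM Person.Person src
WHERE NOT EXISTS (SELECT 1 FROM dbo.Person dest
                  WHERE dest.EntityID = src.BusinessEntityID);

-- Updated rows: present in both, but with differing column values
UPDATE dest
SET dest.FirstName  = src.FirstName,
    dest.MiddleName = src.MiddleName,
    dest.LastName   = src.LastName
FROM dbo.Person dest
JOIN Person.Person src ON src.BusinessEntityID = dest.EntityID
WHERE dest.FirstName <> src.FirstName
   OR ISNULL(dest.MiddleName, '') <> ISNULL(src.MiddleName, '')
   OR dest.LastName <> src.LastName;

-- Deleted rows: present in the destination, absent from the source
DELETE dest
FROM dbo.Person dest
WHERE NOT EXISTS (SELECT 1 FROM Person.Person src
                  WHERE src.BusinessEntityID = dest.EntityID);
```

The SSIS package in this series performs the same detection row by row with Lookup transformations, then applies the changes in set-based Execute SQL tasks.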
Deleting a Missing Row
Let's set up the test conditions. The idea is to create a row that exists in the target table but not in the source table, which is equivalent to inserting a row into the target and then deleting it from the source. Open SSMS and execute the following T-SQL statement:

Use AdventureWorks2012
Go
Insert Into dbo.Person
  (FirstName, MiddleName, LastName, EntityID)
Values ('Leonard', -- the remaining values were truncated in this copy of the article

If we had already implemented the incremental-load delete logic, this row, missing from the source, would be detected and removed from the target. This also gives us a good opportunity to test the impact of the extra row on the existing incremental-load logic. Press F5 to execute the SSIS package, then check the control flow and the Data Flow task for error messages. My control flow shows no errors, and my data flow looks like this:

Figure 5.1 Data Flow Task
This is what I wanted: the incremental load ran successfully, and no new or updated rows were detected. Let's take a closer look at the Lookup transformation. Click the Progress tab to find the number of records it cached:

Figure 5.2 The Lookup transformation cache count on the Progress tab
The source table (Person.Person) has 19,972 rows, while the Lookup transformation returns 19,973 rows from the target table (dbo.Person). The extra row is found and cached by the Lookup, but it has no effect on the existing insert and update data flow logic.
Before we add a new component to the SSIS package, let's rename the existing Data Flow task to "Insert and Update Rows". Then add a new Data Flow task to the control flow canvas, connect the "Apply Staged Updates" Execute SQL task to it with a precedence arrow, and rename the new Data Flow task "Delete Rows":

Figure 5.3 Add Delete rows Data Flow task
Open the Delete Rows Data Flow task. Add an OLE DB source, open it, and configure the following properties:
OLE DB connection manager: 127.0.0.1,5377.AdventureWorks2012.sa
Data access mode: Table or view
Name of the table or the view: dbo.Person

Figure 5.4 Configuring the OLE DB source
In this Data Flow task we use the target table (dbo.Person) as the source. Next we use a Lookup transformation to detect rows that exist in the source but not in the destination (source and destination here are relative to this Data Flow task). This is exactly the same process we used to detect new rows in the earlier Data Flow task. Drag a Lookup transformation onto the Delete Rows Data Flow task and connect the OLE DB source to the Lookup:

Figure 5.5 Adding the Lookup and connecting the source to it
Open the Lookup Transformation Editor. On the General page, change "Specify how to handle rows with no matching entries" to "Redirect rows to no match output":

Figure 5.6 Modify the unmatched row output
On the Connection page, set the OLE DB connection manager to 127.0.0.1,5377.AdventureWorks2012.sa, select "Use results of an SQL query", and type the following T-SQL:

Select BusinessEntityID As EntityId
From Person.Person

We only need the EntityId identifier column. Click the Columns page and drag the EntityId column from Available Input Columns onto the EntityId column in Available Lookup Columns:

Figure 5.7 Lookup Transformation Column Mappings
Because we configured the Lookup to redirect unmatched rows to the no match output, rows from the source (dbo.Person) that have no matching row in the lookup table (Person.Person) are routed to the no match output. Click the OK button to close the Lookup Transformation Editor.
How do we complete the delete operation? We could add an OLE DB Command transformation (as we did in the update logic), but I dare say you already know what comes next. Let's skip that intermediate step: add an OLE DB destination directly to the Delete Rows Data Flow task, and connect the Lookup transformation's no match output to the OLE DB destination:

Figure 5.8 Adding an OLE DB destination
Rename the OLE DB destination "StageDeletes", then right-click it and open the editor. As before, click the New button next to the "Name of the table or the view" drop-down list. The Create Table window should contain the following DDL:

CREATE TABLE [StageDeletes] (
    [EntityId] int,
    [FirstName] nvarchar(50),
    [LastName] nvarchar(50),
    [MiddleName] nvarchar(50)
)

Click OK to close the Create Table window (this creates the StageDeletes table), then click the Mappings page to complete the OLE DB destination column mappings:

Figure 5.9 OLE DB Destination column mappings
Click OK to close the OLE DB Destination Editor. As before, we need to apply this staged set of deletes to dbo.Person. Add an Execute SQL task to the control flow and connect the green precedence arrow from the Delete Rows Data Flow task to the new Execute SQL task:

Figure 5.10 Adding the Execute SQL Task component
Open the Execute SQL Task Editor and modify the following properties:
Name: Apply Staged Deletes
Connection: 127.0.0.1,5377.AdventureWorks2012.sa
SQLStatement:

Delete dest
From dbo.Person dest
Join StageDeletes stage
  On stage.EntityId = dest.EntityId

Click the OK button to close the Execute SQL Task Editor. Let's test this task on its own: right-click the Apply Staged Deletes Execute SQL task and click Execute Task:

Figure 5.11 Performing a separate SQL task
If we configured everything correctly, it executes successfully. Stop the BIDS debugger. As with the set-based update logic, we need to manage the StageDeletes table: we truncate it before each load, so the records from the most recent load are kept in case we need to inspect them. There is no need to create a separate Execute SQL task for this; we already have a truncate task. Open its editor and add the following statement below the existing one:

Truncate Table StageDeletes

The input SQL query should resemble the following:

Figure 5.12 Entering a SQL query
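Figure 5.12 was an image in the original article. With the line above added, the truncate task's statement presumably resembles the following (assuming the update-staging table from the previous article was named StageUpdates, as the task's new name suggests):

```sql
Truncate Table StageUpdates
Truncate Table StageDeletes
```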
Click the OK button to close the Enter SQL Query window, and before closing the Execute SQL Task Editor, change the task's Name property to "Truncate StageUpdates and StageDeletes". Click OK and press F5 to execute the SSIS package:

Figure 5.13 Execution of a successful control flow canvas
Looking good. Let's view the Delete Rows Data Flow task:

Figure 5.14 Delete rows Data Flow Task
At this point, the lone test row has been deleted from the target table (dbo.Person). We can verify this in SSMS: a count of the matching rows returns 0, indicating that the record has been deleted.
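The verification statement itself did not survive in this copy of the article. A minimal sketch, assuming we count the test row inserted at the start of this article by its FirstName value:

```sql
-- Expect 0 after the delete logic runs; 1 before
Select Count(*) As TestRows
From dbo.Person
Where FirstName = 'Leonard'
```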
Summary
We accomplished a lot in this article and completed our first SSIS package: an incremental load. We worked through some introductory ETL test cases, and we learned more about the Lookup transformation and the Execute SQL task. We designed a repeatable SSIS package built around the concept of incremental loading: it could be executed once a month, applying all changes from the previous month to the target, or every five minutes, applying changes almost as they occur.
