Using the CDC in SSIS 2012 (data Change capture)

Source: Internet
Author: User
Tags create database ssis

With a few gaps in the latest projects, starting with some of the bi features of SQL Server 2012 and 2014, referring to an example from Matt, we started to experience the CDC in SSIS (change data Capture).

Note: If you need to know about the CDC in SQL Server 2008, see here http://blog.csdn.net/downmoon/article/details/7443627), this article assumes that readers have an understanding of how the CDC works. ^_^.

We complete the example in three steps:

1, prepare the basic data;

2, the design of an initial package;

3, on the basis of 2 design an incremental package.

First, complete the following ready installation:

(1) Visual Studio 2012 or Visual Studio Shell (Isolated) Redistributable Package

http://www.microsoft.com/en-us/download/details.aspx?id=30678

http://www.microsoft.com/en-us/download/details.aspx?id=30670

(2) SQL Server Data tools-business Intelligence for Visual Studio 2012

http://www.microsoft.com/zh-cn/download/details.aspx?id=36843

(2) SQL Server 2012 Enterprise Edition or Development edition

http://www.microsoft.com/en-us/download/details.aspx?id=29066

(3) Sample Database AdventureWorksDW2012 (This article must, if you do not need to build the table)

http://msftdbprodsamples.codeplex.com/releases/view/55330

Okay, first step:

/*-
=============================================-
Create test database and datasheet, with AdventureWorksDW2012 sample database
--- Generate by Downmoon (invitation month), 3w@live.cn
-=============================================
* * *
--create Database cdctest
--go
--use [cdctest]
--go--select

* into DIMCUSTOMER_CDC--from
[ ADVENTUREWORKSDW2012]. [dbo]. [DimCustomer]
--where Customerkey < 11500;

--select * from DIMCUSTOMER_CDC;

/*-
=============================================-
enables the database-level CDC, only valid for enterprise and Development Editions
---Generate by Downmoon ( Invitation month), 3w@live.cn
-=============================================
* *
 [cdctest]

Go EXEC sys.sp_cdc_enable_db
Go

--Add a primary key to the DIMCUSTOMER_CDC table so we can enable support for net C Hanges
IF not EXISTS (SELECT * from sys.indexes WHERE object_id = 
object_id (N ' [dbo].[ DIMCUSTOMER_CDC] and name = N ' PK_DIMCUSTOMER_CDC ')
  ALTER TABLE [dbo].[ DIMCUSTOMER_CDC] ADD CONSTRAINT 
[PK_DIMCUSTOMER_CDC] PRIMARY KEY CLUSTERED 
(
    [Customerkey] ASC
) Go/

*-=============================================-
enable table level CDC
---Generate by Downmoon (Invited month), 3w@live.cn
-=============================================/
EXEC sys.sp_cdc_ enable_table 
@source_schema = n ' dbo ',
@source_name = n ' dimcustomer_cdc ',
@role_name = n ' cdc_admin ',
@supports_net_changes = 1

Go

/*-
=============================================-
creates a target table with the same table structure as the source table
-Note that in a production environment, Can be a different instance or server, in this case, for convenience, demonstrate
---Generate by Downmoon (invitation month) in the same database instance of the same database, 3w@live.cn
-===================== ========================
/
SELECT top 0 * to dimcustomer_destination from
DIMCUSTOMER_CDC
-- SELECT @ @version;
SELECT * from Dimcustomer_destination;

Step Two: Create the initial package

--=============================================--
We use two packages to complete the sample, an initial package completes the initial loading of the data, and an incremental package completes the change capture of the data
--- Generate by Downmoon (invitation month), 3w@live.cn
--=============================================


The initial package contains the following logic:
(1) Initial load start LSN using CDC control task token (use the CDC control task to mark the initial load start LSN)
(2) Convert all source table data to the target table (Transfer all of the "data" from the "source table into" Our destination table)
(3) Initial load ending LSN using CDC control task token (use the CDC control task to mark the initial load end LSN)

Example: HTTP://CODE.MSDN.MICROSOFT.COM/MY-FIRST-INTEGRATION-FA41C0B1

Create a new SSIS project, creating a package "Initial Load", as shown in the following figure:

New two CDC control task, named "CDC control task Start" and "CDC control Task", respectively, corresponding to the attribute "Mark Initial load STA RT "and" "Mark Initial Load End"

Connection Manager is Ado.net, and the other properties are as follows:

Add a "Data flow Task" in the middle, and the property defaults.

At this point, run the package, visible cdc_states has the initial tag.


Step three: Create an incremental package

The delta package contains the following logic:
(1) Create a Connection manager for the source database (Create a Connection manager for the "source")
(2) Set the CDC operator to obtain the processing boundary (set the CDC control Operation to get processing range)
(3) Creation of a new CDC state variable (cdc_state) (Create a, new, variable (cdc_state))
(4) Create a Connection manager for the target database (create a Connection manager for the destination database)
(5) Select the state table created by the previous initial load package (this is created by the Initial load package) –[dbo].[ Cdc_states])
(6) Set the status name (must match the state name used by the initial add-in package, this must match what is used in the Initial load package (cdc_state))

Create a new package in the project named "Incremental Load"

In the control flow view of the package, manually 6 tasks from top to bottom, in the order shown below, to remove the three tasks used above and the rest are execute SQL task

Note: The CDC operator for CDC control task end is the MARK process range, and CDC control task start's CDC operator is get Process range

The SQL statements for the remaining 4 execute SQL tasks are as follows:

--create stage Tables
IF not EXISTS (SELECT * from sys.objects WHERE object_id = object_id (N ' [dbo].[ Stg_dimcustomer_updates] ') and type in (N ' U '),
BEGIN
   SELECT top 0 * to Stg_dimcustomer_updates
   from Dimcustomer_destination
End

IF is EXISTS (SELECT * from sys.objects WHERE object_id = object_id (N ' [dbo].[ Stg_dimcustomer_deletes] ') and type in (N ' U '),
BEGIN
   SELECT top 0 * to Stg_dimcustomer_deletes
   from Dimcustomer_destination End

--Batch update
update dest
SET 
    dest. FirstName = STG. FirstName, 
    dest. MiddleName = STG. MiddleName,
    dest. LastName = STG. LastName, 
    dest. Yearlyincome = STG. Yearlyincome 
    from [dimcustomer_destination] dest, 
    [stg_dimcustomer_updates] STG
WHERE 
    Stg.[customerkey] = dest. [Customerkey]

--Batch Delete
delete from [dimcustomer_destination]
  Where[customerkey] in 
(
    SELECT [Customerkey] From
    [dbo]. [Stg_dimcustomer_deletes]
)

--TRUNCATE TABLE 
truncate TABLE  [dbo].[ Stg_dimcustomer_deletes]
truncate TABLE  [dbo].[ Stg_dimcustomer_updates]


The most critical step, select CDC control Task Start and switch to data flow, and drag the CDC SOURCE,CDC Splitter Transformer from top to bottom, three ado.net destination, as shown below:

The target tables for three were: [Dimcustomer_destination],stg_dimcustomer_deletes,stg_dimcustomer_updates.

The Connection manager properties for CDC source are as follows:

At this point, you can run the Delta package, but we will not see any results of the run, because at this point we have not made the insert or update operation of the data.

Down we provide a script that tests the effect:

--=============================================--Update some data to show the CDC effect in SSIS 2012---Generate by Downmoon (invitation month), 3w@live.cn --============================================= use [cdctest] Go--Transfer the remaining customer rows SET identity_ Insert DIMCUSTOMER_CDC on INSERT INTO DIMCUSTOMER_CDC (Customerkey, Geographykey, Customeralternatekey, Title, F Irstname, MiddleName, LastName, Namestyle, birthdate, MaritalStatus, Suffix, Gender, EmailAddress, Yearlyi Ncome, Totalchildren, Numberchildrenathome, Englisheducation, Spanisheducation, Frencheducation, ENGLISHOCC  Upation, Spanishoccupation, Frenchoccupation, Houseownerflag, numbercarsowned, AddressLine1, AddressLine2, 
       Phone, Datefirstpurchase, commutedistance) SELECT Customerkey, Geographykey, Customeralternatekey, Title, FirstName, MiddleName, LastName, Namestyle, birthdate, MaritalStatus, Suffix, Gender, EmailAddress, Yearlyincome, Totalc Hildren, NumberchildRenathome, Englisheducation, Spanisheducation, Frencheducation, Englishoccupation, Spanishoccupation, Frenc Hoccupation, Houseownerflag, numbercarsowned, AddressLine1, AddressLine2, Phone, Datefirstpurchase, Commutedistanc e from [AdventureWorksDW2012]. [dbo].
[DimCustomer] 
WHERE customerkey =11502 SET identity_insert dimcustomer_cdc off go-give people a raise UPDATE DIMCUSTOMER_CDC SET yearlyincome = yearlyincome + WHERE customerkey >= 11000 and Customerkey <= 11010 go

At this point, we can see the result of the change capture:

If you think it's not intuitive enough, "Enable Data Viewer",

At this point, a demo of the CDC example in SSIS 2012 ends, and if there is further research, please move to MSDN, with links below. This article also provides a sample project package for research purposes.

Project file download 1, project file download 2

This article refers to:

Http://msdn.microsoft.com/en-us/library/bb895315.aspx

http://www.mattmasson.com/index.php/2011/12/cdc-in-ssis-for-sql-server-2012-2/?utm_source=rss&utm_medium= Rss&utm_campaign=cdc-in-ssis-for-sql-server-2012-2

Invite the Month Note: This article copyright by invite month and CSDN Common all, reprint please indicate the source.
Helping others equals self-help! 3w@live.cn


Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.