A Free Trial That Lets You Build Big!
Start building with 50+ products and up to 12 months usage for Elastic Compute Service
With a few gaps in the latest projects, starting with some of the bi features of SQL Server 2012 and 2014, referring to an example from Matt, we started to experience the CDC in SSIS (change data Capture).
Note: If you need to know about the CDC in SQL Server 2008, see here http://blog.csdn.net/downmoon/article/details/7443627), this article assumes that readers have an understanding of how the CDC works. ^_^.
We complete the example in three steps:
1, prepare the basic data;
2, the design of an initial package;
3, on the basis of 2 design an incremental package.
First, complete the following ready installation:
(1) Visual Studio 2012 or Visual Studio Shell (Isolated) Redistributable Package
(2) SQL Server Data tools-business Intelligence for Visual Studio 2012
(2) SQL Server 2012 Enterprise Edition or Development edition
(3) Sample Database AdventureWorksDW2012 (This article must, if you do not need to build the table)
Okay, first step:
/*- =============================================- Create test database and datasheet, with AdventureWorksDW2012 sample database --- Generate by Downmoon (invitation month), firstname.lastname@example.org -============================================= * * * --create Database cdctest --go --use [cdctest] --go--select * into DIMCUSTOMER_CDC--from [ ADVENTUREWORKSDW2012]. [dbo]. [DimCustomer] --where Customerkey < 11500; --select * from DIMCUSTOMER_CDC;
/*- =============================================- enables the database-level CDC, only valid for enterprise and Development Editions ---Generate by Downmoon ( Invitation month), email@example.com -============================================= * * [cdctest] Go EXEC sys.sp_cdc_enable_db Go --Add a primary key to the DIMCUSTOMER_CDC table so we can enable support for net C Hanges IF not EXISTS (SELECT * from sys.indexes WHERE object_id = object_id (N ' [dbo].[ DIMCUSTOMER_CDC] and name = N ' PK_DIMCUSTOMER_CDC ') ALTER TABLE [dbo].[ DIMCUSTOMER_CDC] ADD CONSTRAINT [PK_DIMCUSTOMER_CDC] PRIMARY KEY CLUSTERED ( [Customerkey] ASC ) Go/ *-=============================================- enable table level CDC ---Generate by Downmoon (Invited month), firstname.lastname@example.org -=============================================/ EXEC sys.sp_cdc_ enable_table @source_schema = n ' dbo ', @source_name = n ' dimcustomer_cdc ', @role_name = n ' cdc_admin ', @supports_net_changes = 1 Go
/*- =============================================- creates a target table with the same table structure as the source table -Note that in a production environment, Can be a different instance or server, in this case, for convenience, demonstrate ---Generate by Downmoon (invitation month) in the same database instance of the same database, email@example.com -===================== ======================== / SELECT top 0 * to dimcustomer_destination from DIMCUSTOMER_CDC -- SELECT @ @version; SELECT * from Dimcustomer_destination;
Step Two: Create the initial package
--=============================================-- We use two packages to complete the sample, an initial package completes the initial loading of the data, and an incremental package completes the change capture of the data --- Generate by Downmoon (invitation month), firstname.lastname@example.org --=============================================
The initial package contains the following logic:
(1) Initial load start LSN using CDC control task token (use the CDC control task to mark the initial load start LSN)
(2) Convert all source table data to the target table (Transfer all of the "data" from the "source table into" Our destination table)
(3) Initial load ending LSN using CDC control task token (use the CDC control task to mark the initial load end LSN)
Create a new SSIS project, creating a package "Initial Load", as shown in the following figure:
New two CDC control task, named "CDC control task Start" and "CDC control Task", respectively, corresponding to the attribute "Mark Initial load STA RT "and" "Mark Initial Load End"
Connection Manager is Ado.net, and the other properties are as follows:
Add a "Data flow Task" in the middle, and the property defaults.
At this point, run the package, visible cdc_states has the initial tag.
Step three: Create an incremental package
The delta package contains the following logic:
(1) Create a Connection manager for the source database (Create a Connection manager for the "source")
(2) Set the CDC operator to obtain the processing boundary (set the CDC control Operation to get processing range)
(3) Creation of a new CDC state variable (cdc_state) (Create a, new, variable (cdc_state))
(4) Create a Connection manager for the target database (create a Connection manager for the destination database)
(5) Select the state table created by the previous initial load package (this is created by the Initial load package) –[dbo].[ Cdc_states])
(6) Set the status name (must match the state name used by the initial add-in package, this must match what is used in the Initial load package (cdc_state))
Create a new package in the project named "Incremental Load"
In the control flow view of the package, manually 6 tasks from top to bottom, in the order shown below, to remove the three tasks used above and the rest are execute SQL task
Note: The CDC operator for CDC control task end is the MARK process range, and CDC control task start's CDC operator is get Process range
The SQL statements for the remaining 4 execute SQL tasks are as follows:
--create stage Tables IF not EXISTS (SELECT * from sys.objects WHERE object_id = object_id (N ' [dbo].[ Stg_dimcustomer_updates] ') and type in (N ' U '), BEGIN SELECT top 0 * to Stg_dimcustomer_updates from Dimcustomer_destination End IF is EXISTS (SELECT * from sys.objects WHERE object_id = object_id (N ' [dbo].[ Stg_dimcustomer_deletes] ') and type in (N ' U '), BEGIN SELECT top 0 * to Stg_dimcustomer_deletes from Dimcustomer_destination End
--Batch update update dest SET dest. FirstName = STG. FirstName, dest. MiddleName = STG. MiddleName, dest. LastName = STG. LastName, dest. Yearlyincome = STG. Yearlyincome from [dimcustomer_destination] dest, [stg_dimcustomer_updates] STG WHERE Stg.[customerkey] = dest. [Customerkey]
--Batch Delete delete from [dimcustomer_destination] Where[customerkey] in ( SELECT [Customerkey] From [dbo]. [Stg_dimcustomer_deletes] )
--TRUNCATE TABLE truncate TABLE [dbo].[ Stg_dimcustomer_deletes] truncate TABLE [dbo].[ Stg_dimcustomer_updates]
The most critical step, select CDC control Task Start and switch to data flow, and drag the CDC SOURCE,CDC Splitter Transformer from top to bottom, three ado.net destination, as shown below:
The target tables for three were: [Dimcustomer_destination],stg_dimcustomer_deletes,stg_dimcustomer_updates.
The Connection manager properties for CDC source are as follows:
At this point, you can run the Delta package, but we will not see any results of the run, because at this point we have not made the insert or update operation of the data.
Down we provide a script that tests the effect:
--=============================================--Update some data to show the CDC effect in SSIS 2012---Generate by Downmoon (invitation month), email@example.com --============================================= use [cdctest] Go--Transfer the remaining customer rows SET identity_ Insert DIMCUSTOMER_CDC on INSERT INTO DIMCUSTOMER_CDC (Customerkey, Geographykey, Customeralternatekey, Title, F Irstname, MiddleName, LastName, Namestyle, birthdate, MaritalStatus, Suffix, Gender, EmailAddress, Yearlyi Ncome, Totalchildren, Numberchildrenathome, Englisheducation, Spanisheducation, Frencheducation, ENGLISHOCC Upation, Spanishoccupation, Frenchoccupation, Houseownerflag, numbercarsowned, AddressLine1, AddressLine2, Phone, Datefirstpurchase, commutedistance) SELECT Customerkey, Geographykey, Customeralternatekey, Title, FirstName, MiddleName, LastName, Namestyle, birthdate, MaritalStatus, Suffix, Gender, EmailAddress, Yearlyincome, Totalc Hildren, NumberchildRenathome, Englisheducation, Spanisheducation, Frencheducation, Englishoccupation, Spanishoccupation, Frenc Hoccupation, Houseownerflag, numbercarsowned, AddressLine1, AddressLine2, Phone, Datefirstpurchase, Commutedistanc e from [AdventureWorksDW2012]. [dbo]. [DimCustomer] WHERE customerkey =11502 SET identity_insert dimcustomer_cdc off go-give people a raise UPDATE DIMCUSTOMER_CDC SET yearlyincome = yearlyincome + WHERE customerkey >= 11000 and Customerkey <= 11010 go
At this point, we can see the result of the change capture:
If you think it's not intuitive enough, "Enable Data Viewer",
At this point, a demo of the CDC example in SSIS 2012 ends, and if there is further research, please move to MSDN, with links below. This article also provides a sample project package for research purposes.
Project file download 1, project file download 2
This article refers to:
Invite the Month Note: This article copyright by invite month and CSDN Common all, reprint please indicate the source.
Helping others equals self-help! firstname.lastname@example.org
Start building with 50+ products and up to 12 months usage for Elastic Compute Service