SQL Server change data capture (CDC)

Source: Internet
Author: User


Change data capture, or CDC, records the INSERT, update, and delete activities of SQL Server tables. Using change data capture makes it more efficient to keep track of the DML history of Table objects, which is also useful for data transfer such as ETL.


Change data capture applicable version:

Enterprise Edition, Developer Edition, and Evaluation Edition for SQL Server 2008


Change data Capture principle:

Change data capture changes the data source to the SQL Server transaction log. When change data capture is enabled on a table, a copy of the table structure is generated. When the source table is inserted, updated, and deleted, related operational information is logged in the transaction log. The change data capture agent uses the asynchronous process to read the transaction log, applying the relevant action results to the replica table (capture instance table), which completes the record tracking of the source table operations.



Change data Capture instance:


To create a test table in database MyDatabase:

--Create a test table use Mydatabasegocreate table Cdc_test (ID int. not null,name varchar (a) Not null,insertdate datetime not Null,value Numeric (14,4) not null) Goalter TABLE cdc_test ADD CONSTRAINT pk_cdc_test PRIMARY KEY CLUSTERED (ID) gocreate nonclustered I Ndex ix_cdc_test_name on Cdc_test (NAME) gocreate UNIQUE nonclustered INDEX ix_cdc_test_insertdate on cdc_test (insertDate ) GO

To view whether the database or table has CDC enabled:

---see if the database is enabled Cdcselect name,is_cdc_enabled from sys.databases WHERE is_cdc_enabled = To see if the current database table is enabled Cdcselect Name,is_ TRACKED_BY_CDC from sys.tables WHERE IS_TRACKED_BY_CDC = 1

To enable CDC for the current database:

--Enable Cdcuse Mydatabasegoexecute sys.sp_cdc_enable_db for the current database; GO

The following errors and workarounds may occur:

/* msg 22830, Level 16, State 1, procedure sp_cdc_enable_db_internal, line 186th cannot update the metadata to indicate that change data capture has been enabled for the database MyDatabase. Failed to execute command ' setcdctracked (Value = 1) '. The error returned is 15404: ' Could not get information about Windows NT Group/user ' Kk\administrator ', error code 0x54b. '。 Use this action and error to determine the cause of the failure and resubmit the request. MSG 266, Level 16, State 2, procedure sp_cdc_enable_db_internal, the transaction count after the No. 0 row of execute indicates that the number of BEGIN and COMMIT statements does not match. Previous count = 0, current count = 1. MSG 266, Level 16, State 2, procedure sp_cdc_enable_db, the transaction count after the No. 0 row of execute indicates that the number of BEGIN and COMMIT statements does not match. Previous count = 0, current count = 1. Msg 3998, Level 16, State 1, line 1th detects a non-committed transaction at the end of the batch process. The transaction is rolled back. */--reason is that the database owner is a Windows user and instead "sa" EXEC dbo.sp_changedbowner @loginame = N ' sa ', @map = falsego--dependent alias has been deleted

After you start the database CDC, then enable CDC for the specified source table:

--then enable Cdcexec sys.sp_cdc_enable_table for the specified source table @source_schema = ' dbo ',--source table schema @source_name = ' cdc_test ',--source table @role_name = ' Cdc_role '--role (will be created automatically) go--job ' CDC. Mydatabase_capture ' has successfully started. --Jobs ' CDC. Mydatabase_cleanup ' has successfully started.

When the creation is complete, you will see that the following objects are created in the database:


--You can also use the script to view information about the trace table exec sys.sp_cdc_help_change_data_capturegoexec sys.sp_cdc_help_change_data_capture ' dbo ', ' CDC_ Test ' GO

The following are the main information for each table in the system table:

" Cdc.captured_columns "

Select * from CDC . Captured_columns

Each table that enables change database capture , all of its tracked columns can be viewed in the system table (reference: CDC. captured_columns )


EXEC SYS . Sp_cdc_get_captured_columns N ' dbo_cdc_test ' ;

We recommend that you use this stored procedure to return the metadata information for a column (refer to:sys. sp_cdc_get_captured_columns )



" Cdc.change_tables "

Select * from CDC . Change_tables

each time a source table is enabled for change data capture , the table records the details of its instance table (refer to:CDC. change_tables)

EXEC SYS . sp_cdc_help_change_data_capture

EXEC SYS . sp_cdc_help_change_data_capture @source_schema = ' dbo ' , @source_name = ' Cdc_test '

you can return up to two rows for each source table , returns one row for each capture instance (refer to:sys. sp_cdc_help_change_data_capture)



" Cdc.index_columns "

Select * from CDC . Index_columns

A unique index column specified on the primary key column in the source table or when change data capture is enabled ( Specifies that the index is superior to the primary key ) record a row , change data capture uses these index columns to uniquely identify rows in the source table . By default , the primary key column of the source table is included. (Reference:CDC. ) index_columns)


EXEC SYS . sp_cdc_help_change_data_capture @source_schema = ' dbo ' , @source_name = ' Cdc_test '

Or use this stored procedure to view (for example)


" cdc.ddl_history "

Select * from CDC . ddl_history

Each data definition language that is made for tables that have change data capture enabled (DDL) Change returns one row . You can use this table to determine when a DDL change occurred in the source table and what changed. (Reference:CDC. ) ddl_history)


EXEC SYS . sp_cdc_get_ddl_history @capture_instance = ' Dbo_cdc_test '

--Test the DDL operation before viewing ALTER TABLE Cdc_test ADD info VARCHAR () ALTER TABLE cdc_test DROP COLUMN Info


" cdc.<capture_instance>_ct "

Select * from CDC . dbo_cdc_test_ct

Capture Instance Table : This is the most important table. , The table is all of the record source tables DML Operation Record . Each table corresponds to an instance table named " schema name _ table name " _ct "

each insert or delete operation applied to the source table takes one row in the change table . the data column of the row generated by the insert operation contains the column values after the insertion . the data column of the row generated by the delete operation contains the column values before the deletion . The update operation requires two rows of data : one row to identify the column values before the update , and the other to identify the updated column values .


Now, after inserting, updating, and deleting the source table, view the trace instance table:

--For related operations insert into cdc_test (id,name,insertdate,value) Select 1, ' KK ', GETDATE (), 55goupdate cdc_test set name = ' hh ', Value = where  name = ' KK ' godelete from cdc_test where id = 1goselect * from cdc.dbo_cdc_test_ct

__$start_lsn : log sequence number (LSN) associated with the committed transaction for the corresponding change

__$END_LSN : ( in SQL Server , this column is always NULL)

__$seqval : change the order of rows within a transaction

__$operation : source table DML operations

1 = delete

2 = insert

3 = Update (old value)

4 = Update (new value)

__$update_mask : Bitmask based on the column ordinal of the change table that identifies the columns that have changed


Re-test the effect of DDL on the change table:

--For related operations insert into cdc_test (Id,name,insertdate,value) SELECT 2, ' mm ', GETDATE (), 0ALTER TABLE cdc_test ADD Info VARCHAR ( 20)--Add 1 new column INSERT into Cdc_test (Id,name,insertdate,value) SELECT 2, ' mm ', GETDATE (), 0ALTER TABLE cdc_test DROP column value--Delete 1 columns insert into cdc_test (id,name,insertdate) Select 3, ' hh ', GETDATE () SELECT * from cdc.dbo_cdc_test_ct--view change table

Summary of results:

1. Newly added columns, such as info, are not added in the change table. However, trace logging operations are still possible, except that the columns in the table Cdc.captured_columns do not track records.

2. The column is deleted (such as value), and the change table is identified as null. It is also invalid to add the original field immediately and then create it.


After adding or deleting a column, there is no record tracking, is this situation increased?

One way is to:

1. Adding or deleting a field

2. Enable another change data capture for the same table (the new change table is the current structure of the source table)

3. Based on ID/time/Unique key and other new tracking table data (not real-time access to data, such as regular transfer of data, etc.)

4. If you feel that each call changes the table name, you can use the view, call the View Query change table, the view just change the corresponding table on the row.


Capture some of the constraints in the instance table:

The data type  of the Timestamp/rowversion column is defined as binary (8)  The data type of the Identity column is defined as int or bigint for LOB data type varchar (max), nvarchar ( MAX), varbinary (max), image, text, ntext, and XML, if the LOB column is updated, the value before the update is recorded in the Capture table, or the value before the update (other columns are updated immediately) is null. This saves space. Truncate table will not be able to use the SWITCH PARTITION partial row for tracking-enabled tables  will not be captured


"cdc.lsn_time_mapping"

Select * from CDC . lsn_time_mapping

When the capture process submits each batch of new change data , A new item is added to the table for each transaction that has a change table entry (refer to:CDC. lsn_time_mapping)


Recommended Implementation SYS . Fn_cdc_map_lsn_to_time and the SYS . FN_CDC_MAP_TIME_TO_LSN system Functions


Combined with cdc.fn_cdc_get_all_changes_<capture_instance> and cdc.fn_cdc_get_net_changes_<capture_instance>, You can get data changes over a range.

DECLARE @begin_time datetime, @end_time datetime, @begin_lsn Binary (Ten), @end_lsn binary (10); SET @begin_time = ' 2015-05-16 00:00:00.000 '; SET @end_time = ' 2015-05-17 00:00:00.000 '; SELECT @begin_lsn = sys.fn_cdc_map_time_to_lsn (' smallest greater than or equal ', @begin_time); SELECT @end_lsn = SYS.FN_CDC_MAP_TIME_TO_LSN (' largest less than ', @end_time);--select @begin_lsn, @end_lsnSELECT * from Cdc.fn_cdc_get_all_changes_dbo_cdc_test (@begin_lsn, @end_lsn, ' All update old ') SELECT * from Cdc.fn_cdc_get_all_chang Es_dbo_cdc_test (@begin_lsn, @end_lsn, ' all ') SELECT * from Cdc.fn_cdc_get_net_changes_dbo_cdc_test (@begin_lsn, @end_l SN, ' all ')


Statistical information: reference management and monitoring change data capture (SQL Server)

--if there is a contiguous set of empty results (for example, when the capture job is running continuously), the Empty_scan_count in the last existing row is incremented--returns a row for each error encountered in the change data capture log scan session select * FROM SYS.DM_CDC_ Errors-Returns a row for each log scan session in the current database. The last line returned represents the current session. SELECT * FROM sys.dm_cdc_log_scan_sessions--null-scanned Session SELECT * from Sys.dm_cdc_log_scan_sessions where empty_scan_count <> 0--Returns the average latency for the most recent session select latency from sys.dm_cdc_log_scan_sessions WHERE session_id = 0--Returns the average throughput of the most recent session Select Command_count/duration as [throughput] from sys.dm_cdc_log_scan_sessions WHERE session_id = 0


--View information for a capture job or cleanup job--select * from Msdb.dbo.sysjobsSELECT * from Msdb.dbo.cdc_jobsEXEC sys.sp_cdc_help_jobs


changes to the job (ref. Sys.sp_cdc_change_job)

--Changes to the job exec sys.sp_cdc_change_job  @job_type = ' capture ', @maxtrans = 1000--The maximum number of transactions that can be processed per scan cycle, @maxscans = 10-- The maximum number of scan cycles to be performed in order to fetch all rows from the log, @continuous = max continuous processing (Max_trans * Max_scans) transactions, @pollinginterval = 5EXEC sys.sp_cdc_ Change_job  @job_type = ' Cleanup ', @retention = 4320--Change the number of minutes that the row will remain in the change table, @threshold = maximum number of deleted items that can be deleted with a single statement when the 5000--is cleared-- Need to restart job after change exec sys.sp_cdc_stop_job @job_type = N ' capture '; EXEC sys.sp_cdc_stop_job @job_type = N ' capture '; EXEC sys.sp_cdc_start_job @job_type = N ' cleanup '; EXEC sys.sp_cdc_start_job @job_type = N ' cleanup ';

last entry, disable (delete) change data capture:

--Disable change data capture for table use MyDatabase; Goexec Sys.sp_cdc_disable_table@source_schema = n ' dbo ', @source_name   = n ' cdc_test ', @capture_instance = N ' dbo_cdc_ Test ' go--Disable change data capture for database use MyDatabase; Goexecute sys.sp_cdc_disable_db; When the go--execution is complete, the related tables, functions, users, roles, schemas, and jobs are completely removed!


Reference: Change data capture



SQL Server change data capture (CDC)

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.