For most enterprise applications, there is a basic function that is essential, namely, Audit Trail or Audit Log. Chinese translation is a tracking check, review check or review record. We use Audit Trail to record the basic information of each business operation, such as the basic description, operation time, and operator of the operation. For some applications with high security level or sensitive data, we even need to record the data changes caused by this business operation. Specifically, the "Data Change" here refers to the changes before and after the operation of each affected record. For added records, you need to record the newly inserted records; for deleted records, you need to record the original records; for updated records, you need to record the records before and after the update at the same time.
At this point, many people will think of using triggers to capture data changes. However, this implementation scheme has one of the biggest limitations: Because the trigger is executed within the transaction scope where data operations are located, all performance problems may occur, in severe cases, the execution of the trigger will lead to the transaction supermarket. So here we will introduce a better solution: SQLCDC.
Directory
1. SQLCDC Introduction
2. Enable CDC at the database level
3. Enable CDC for a data table
4. record the data changes of the add record
5. Record data changes in Deleted Data
6. Record data changes in update records
1. SQLCDC Introduction
The full name of CDC is Change Data Capture, which is used to track and Capture Data changes. CDC is a new feature in SQL Server 2008, which appeared in Oracle a long time ago. For versions earlier than SQL Server, if there is no CDC, we can only use triggers to record data changes based on a specific data table, specifically, you can manually create After Insert, After Update, and After Delete triggers to record changed data. CDC provides us with a more convenient, easy-to-use and worry-free way to record the historical operations of a data table.
2. Enable CDC at the database level
By default, the CDC feature of a database is disabled. You can use the is_cdc_enabled field of the system table sys. databases to check whether the CDC feature of a database is enabled. By default, if I run the following SQL statement to check whether the CDC of TestDb is enabled, the field value is 0.
You can enable the CDC feature for the current database by executing the system stored procedure sys. sp_cdc_enable_db. In the following T-SQL code snippet, we opened the CDC feature for TestDb by executing this stored procedure.
Use TestDb
Go
Exec sys.sp_cdc_enable_db
Go
3. Enable CDC for a data table
Since CDC is used to record data changes based on a data table, when the current database CDC is enabled, you also need to explicitly enable the CDC feature for a data table. As a demonstration, we created a simple Users table in TestDb through the following T-SQL, which has only three fields: Id, Name, and Birthday.
CREATE TABLE [dbo].[Users](
[Id] [varchar](50) PRIMARY KEY,
[Name] [nvarchar](50) NOT NULL,
[Birthday] [date] NOT NULL)
To enable the CDC feature of a data table, run the sys. sp_cdc_enable_table stored procedure. The simplest way to call this stored procedure is to specify the Schema and name of the data table and the permissions (roles) required for extracting and changing data ). I open the CDC feature of the Users table we created by executing the following T-SQL, where the @ role_name parameter is set to NULL, indicating that I do not authorize the read change data operation. Sys. sp_cdc_enable_table has many parameters. For the CDC behavior affected by the corresponding parameters, refer to the SQL Server 2008 online document.
Use TestDb
Go
Exec sys.sp_cdc_enable_table 'dbo', 'Users', @role_name = NULL
Go
Note that CDC is actually built on the SQL Server Agent, so you need to start the SQL Server Agent before executing the T-SQL. When the CDC feature of a data Table is enabled, the system creates a Tracking Table for saving data changes ). The Schema of the table is cdc, and the naming method is "CT" after the name of the tracked table. After executing the above T-SQL, we will see that the three fields of the Users table are also in this table as the next system table is created. In addition. The table also has five additional fields: __$ start_lsn, __$ end_lsn, __$ seqval, __$ operation and __$ update_mask, indicating the Log Series Number (Log Sequence Number), operation (delete, insert, before and after modification) information.
4. record the data changes of the add record
Now we can test the CDC's Data Change capture function for a data table. Let's first try adding records. To do this, execute the following T-SQL to insert two User records.
Insert Into Users(Id, Name, Birthday)
Values ('001','Foo','1981-08-24')
Insert Into Users(Id, Name, Birthday)
Values ('002','Bar','1981-08-24')
Then you can view the data in the cdc. dbo_Users_CT table through the following T-SQL to see if the data changes involved in the add operation are saved. From the query results, we can clearly see that the two records added above have been recorded, and the field __$ operation is 2, which indicates the "insert" operation.
5. record changes to updated data
Next let's go to the CDC's tracking record for update operations, so we changed user Foo's Birthday through the following T-SQL.
Update Users
Set Birthday = '1982-7-10'
Where Name = 'Foo'
Execute the full table query for cdc. dbo_Users_CT again. You will see two more records this time. Among them, the first record is the data before modification, and the fourth record is the data after modification. Their __$ operation field values are 3 and 4, respectively.
It is worth mentioning here that the value of the __$ update_mask field represents the field changed in the record update operation. This is a number in hexadecimal notation. You need to convert it to a binary notation when determining the modified field. The value of __$ update_mask corresponding to the above update operation is 0x04, which is converted to a binary system of 100. These three fields represent three fields respectively. However, the order here is from right to left, so the three fields represented by 100 are Birthday, Name, and Id. 1 indicates a change, and 0 indicates a change. Since in the above T-SQL we only changed Birthday, this is consistent with the value of 100.
6. Record data changes in the deletion record
Finally, we will demonstrate what data CDC will record for us when we delete records. Now we execute the following T-SQL to delete all records in the Users table.
Delete From Users
View the records of cdc. dbo_Users_CT. The two more records are deleted, and the value of __$ operation field 1 indicates the "delete" operation.
This article only briefly introduces the basic principles and general usage of SQLCDC. This article titled Introduction to Change Data Capture (CDC) in SQL Server 2008 [go] will give you a more detailed introduction. If you want to study SQLCDC in depth, refer to SQL Server 2008 online documentation.
A complete solution for tracking data changes