Analysis of the change Data capture function of SQL 2008

Source: Internet
Author: User
Tags sql 2008

One of the tasks in common enterprise data Platform management is the constant obsession with SQL Server DBAs, which is monitoring data updates. Many data applications need to capture updates to the business data table. I have seen several solutions:

1, in the data table to add a special sign column;

2, by creating a trigger on the datasheet;

3, through Third-party products, such as Lumigent log Explorer.

In fact, both the 1th and 2nd schemes are not good, because the 1th approach needs to be especially cautious when the application is coded, a section of the data access logic that forgets to update the flag will cause some data updates to be omitted, while the 2nd method has a significant performance impact because the performance overhead of triggers is well known. The 3rd approach actually belongs to a scheme called log audit. Because SQL Server, like other relational databases, all data operations are recorded in the log, a complete history of data manipulation can be obtained by analyzing the logs. SQL Server already has an internal API that can be used by ISV developers for log audit, but Microsoft has a tighter control over the API, and only the core partners who have signed a bunch of protocols can understand the API.

As a result, the current tracking of business data updates has always been a headache for the SQL Server platform, and users need to make a choice between putting in a lot of development effort and investing in additional procurement costs. Fortunately, Microsoft has finally provided a set of audit log mechanisms in SQL Server 2008, the change Data Capture we call the CDC.

How the CDC works

As we mentioned earlier, the CDC works through the analysis log to obtain data operation history information, so what is the CDC's working principle? The following diagram illustrates the rationale of this feature very aptly:

Figure 1

When DML commits to the application database, SQL Server must write to the log, update the data in the cache, and then brush the data back into the data file in the checkpoint.

The CDC's internal process extracts the update history from the log file according to the CDC settings and writes the updated information to the corresponding Update tracking table.

The DBA or the developer accesses the update tracking table by invoking the CDC functions, extracting the updated history information of interest, and updating the Data Warehouse through an ETL application.

Theory update the Tracking table will grow indefinitely, so there is a cleanup process within the CDC, and by default the update tracking information is automatically cleaned up three days after it is written to the trace table.

The CDC configuration

Because the CDC is a relatively high-end feature, CDC functionality can only be found in the enterprise, development, and Evaluation editions of SQL Server 2008.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.