Change Data Capture (CDC) is a software process that identifies and tracks changes in a database. There are various CDC methods, each with their pros and cons, but the main benefit of CDC is that it enables near-real-time data movement from databases. Continuous data capture is important for both analytics use cases (where timely data is used to make decisions) and zero-downtime cloud migrations (where operational databases can’t be taken offline for extended periods).
What is SQL Server?
Microsoft SQL Server is a relational database management system (RDMS) developed by Microsoft. SQL Server is mainly used to store and retrieve data. For the purposes of analysis and business intelligence, data is typically replicated to a data warehouse optimized for large analytics queries.
Change Data Capture is an ideal solution for replication of SQL Server data in near real-time. In this post, we’ll share an overview of both SQL Server’s native CDC functionality and Striim’s high-speed SQL Server CDC reader.
SQL Server CDC Methods
Native SQL Server CDC
SQL Server offers a native CDC functionality, as depicted in the diagram below.
The source table contains the data that’s being inserted, updated, and deleted by applications. The SQL Server transaction log keeps a record of the changes. If CDC is enabled, a SQL Server agent reads inserts, updates, and deletes from the transaction log. These changes are added to a separate change table, wherequery functionsare used to capture the changes so they can be delivered to a target data warehouse or other destination via a data integration process like ETL.
What are the benefits and drawbacks of the native SQL Server Change Data Capture feature?
Built-in feature, no external tools needed
Uses familiar T-SQL language
Adding a change table adds overhead to the source database
Querying change tables adds latency
Change tables need to be cleaned up on a regular basis
Since a change table is created for each CDC-enabled table, it can be challenging to consolidate changes from multiple tables (and databases)
Tutorial: How to set up the native SQL Server CDC functionality
This tutorial describes how to set up change data capture (CDC) in SQL Server using Microsoft’s AdventureWorks sample database to illustrate the steps.
high-speed capture (more than 7x faster than the native SQL Server CDC feature)
exactly-once processing (E1P) to prevent data loss or duplication
support for long-running transactions
As shown in the diagram above, Striim’s SQL Server CDC Reader doesn’t require the use of a change table. Striim reads the change data from the SQL Server transaction logs; performs in-memory transformations, enrichment and analysis; and delivers data to target systems…with millisecond latencies.
Striim enables users to migrate and replicate data from SQL Server databases to cloud and on-premise data warehouses, data lakes, messaging systems, files, and big data targets.