Skip to main content

SQL Server continuous incremental replication

For scenarios where real-time CDC is not feasible or necessary, you can use Incremental Batch Reader as an alternative approach. You can configure this method to use scheduled polling to detect new data based on timestamp or sequence columns, providing lower resource impact than CDC while maintaining data synchronization. You can deploy this solution for batch-oriented integration patterns where near real-time replication is acceptable, and you can configure the polling intervals to match your business requirements

When you deploy Incremental Batch Reader to a Forwarding Agent, you must install the SQL Server JDBC driver as described in Install the Microsoft JDBC Driver for SQL Server 2008 in a Forwarding Agent.

Differences between real-time and incremental replication

Continuous real-time replication captures INSERT, UPDATE, and DELETE operations, so if a writer supports Merge mode, all changes in the source can be replicated in the target. Incremental Batch Reader treats both INSERT and UPDATE operations as inserts and does not capture deletes, so a writer in Merge mode will not be able to replicate all changes to the source in the target, and in Append Only mode, deletes in the source will not be replicated to the target.

SQL Server setup for initial load

In SQL Server, create a user for use by Striim with SELECT privileges on the tables to be read.

Incremental Batch Reader properties for SQL Server sources

property

type

default value

notes

Check Column

String

Specify the name of the column containing the start position value using the syntax <database name>.<schema name>.<table name>=<column name>. The column must have an integer or timestamp data type (such as the creation timestamp or an employee ID number) and the values must be unique and continuously increasing. Names are not case-sensitive.

If you specify multiple tables in the Tables property, you may specify different check columns for the tables separated by semicolons. In this case, you may specify the check column for the remaining tables using wildcards: for example, MYSCHEMA.TABLE1=UUID; MYSCHEMA.%=LAST_UPDATED would use UUID as the start column for TABLE1 and LAST_UPDATED as the start column for the other tables.

When rows have the same Check Column value, they will be buffered in memory until the a row with a different Check Column value is read. To avoid memory usage spikes, do not set Check Column so that many rows will have the same value. For example, selecting a single Check Column containing the day's date (rather than a timestamp) might result in thousands or millions of rows with the same value.

Connection Pool Size

Integer

1

Specify the maximum number of connections that Striim will use. This is the maximum number of tables Striim will be able to read simultaneously, provided that ThreadPool Size is set to the same or a higher number.

Connection URL

String

When connecting through an SSH tunnel (see Using an SSH tunnel to connect to a source or target), specify the IP address of the tunnel.

Specify jdbc:sqlserver://<ip address>:<port>;DatabaseName=<database name>

Database Provider Type

String

Default

Set to SQL Server. Controls which icon appears in the Flow Designer.

Excluded Tables

String

Data for any tables specified here will not be returned. For example, if Tables uses a wildcard, data from any tables specified here will be omitted. Multiple table names (separated by semicolons) and wildcards may be used exactly as for Tables.

Fetch Size

Integer

100

Sets the maximum number of records to be fetched from the database in a single JDBC method execution (see the discussion of fetchsize in the documentation for your JDBC driver).

Password

encrypted password

The password for the specified user. See Encrypted passwords.

Parallel Threads

Integer

1

For Fast Snapshot Loading to Azure Synapse, BiqQuery, Databricks, Fabric Data Warehouse, Microsoft Dataverse, or Snowflake, specify the number of threads Database Reader will use. This value should not be higher than the number of tables to be read. For other targets, leave set to the default of 1.

When Database Reader is started, the tables will be distributed among the threads in round-robin fashion, then each thread will read one table at a time. When the Tables property is a list of tables, the tables are read in that order. If the Tables property uses a wildcard, the sequence is determined by the order in which the JDBC driver gives Database Reader the list of tables.

Known issue DEV-49013: the maximum number of tables supported by Parallel Threads is 255.

Polling Interval

String

120sec

This property controls how often the adapter reads from the source. By default, it checks the source for new data every two minutes (120 seconds). If there is new data, the adapter reads it and sends it to the adapter's output stream. The value may be specified in seconds (as in the default) or milliseconds (for example, 500ms).

Query

String

Optionally, specify a single SQL SELECT statement specifying the data to return. You may query tables, aliases, synonyms, and views.

Query is not supported when Create Schema is True, Parallel Threads is greater than 1, or Restart Behavior on IL Interruption is Truncate_target_table or Replace_target_table.

When Query is specified and Tables is not, the WAEvent TableName metadata field value will be QUERY. When both Query and Tables are specified, the data specified by Query will be returned, and the Tables setting will be used only to populate the TableName field.

If the query includes a synonym containing a period, it must be enclosed in escaped quotes. For example: select * from \"synonym.name\"

If using a query when the output of a DatabaseReader source is the input of a DatabaseWriter target, specify the target table name as the value of DatabaseReader's Tables field.

Return DateTime As

String

Joda

Set to joda to return timestamp values as Joda timestamps.

Set to String to return timestamp values as strings. The primary purpose of this option is to avoid losing precision when microsecond timestamps are converted to Joda milliseconds. The format of the string is yyyy-mm-dd hh:mm:ss.ffffff.

SSL Config

String

Not applicable to SQL Server in this release.

Start Position

String

The value in the specified check column from which Striim will start reading. Striim will read rows in which the check column's value is the same as or greater or later than this value and skip the other rows. Since Check Column may specify multiple tables you must specify the corresponding table name or wildcard for each value. With the default value %=-1, Striim will read all data from all tables specified in the Tables property. With the Check Column example above, the Start Position value could be MYSCHEMA.TABLE1=1234; MYSCHEMA.%=2018-OCT-07 18:37:55. Note that this timestamp format is just an example: your source may require a different java.sql.timestamp format, such as yyyy-mm-dd hh:mm:ss, to match the check column's data type.

Tables

String

Specify the table(s) or view(s) to be read. Specify names as <database name>.<schema name>.<table name>. Names are not case-sensitive.

You may specify multiple tables and views as a list separated by semicolons or with the % wildcard. For example, HR% would read all the tables whose names start with HR. You may use the % wildcard only for tables, not for schemas or databases. The wildcard is allowed only at the end of the string: for example, mydb.prefix% is valid, but mydb.%suffix is not.

If you are using the Query property, specify QUERY as the table name.

Modifying this property can interfere with recovery. If recovery is enabled for the application and it is necessary to modify the tables list, export the application (see Apps page), drop it, update the exported TQL with the new values, and import it (see Apps page).

Threadpool Size

Integer

5

Specify the maximum number of threads the adapter will use. This is the maximum number of tables Striim will be able to read simultaneously, provided that Connection Pool Size is set to the same or a higher number.

Username

String

Specify the user name the adapter will use to log in to the server specified in ConnectionURL. This user must have SELECT permission or privileges on the tables specified in the Tables property.

Incremental Batch Reader’s output type is WAEvent. Output is the same as for Database Reader. See Sample Database Reader WAEvent for an example.