PostgreSQL continuous incremental replication
For scenarios where real-time CDC is not feasible or necessary, you can use Incremental Batch Reader as an alternative approach. You can configure this method to use scheduled polling to detect new data based on timestamp or sequence columns, providing lower resource impact than CDC while maintaining data synchronization. You can deploy this solution for batch-oriented integration patterns where near real-time replication is acceptable, and you can configure the polling intervals to match your business requirements
When you deploy Incremental Batch Reader to a Forwarding Agent, you must install the required PostgreSQL JDBC driver as described in Install the PostgreSQL JDBC driver in a Forwarding Agent.
Differences between real-time and incremental replication
Continuous real-time replication captures INSERT, UPDATE, and DELETE operations, so if a writer supports Merge mode, all changes in the source can be replicated in the target. Incremental Batch Reader treats both INSERT and UPDATE operations as inserts and does not capture deletes, so a writer in Merge mode will not be able to replicate all changes to the source in the target, and in Append Only mode, deletes in the source will not be replicated to the target.
PostgreSQL Database setup for continuous incremental replication
In PostgreSQL, create a user for use by Striim with SELECT privileges on the tables to be read.
Incremental Batch Reader properties for PostgreSQL sources
property | type | default value | notes |
|---|---|---|---|
Check Column | String | Specify the name of the column containing the start position value using the syntax If you specify multiple tables in the Tables property, you may specify different check columns for the tables separated by semicolons. In this case, you may specify the check column for the remaining tables using wildcards: for example, When rows have the same Check Column value, they will be buffered in memory until the a row with a different Check Column value is read. To avoid memory usage spikes, do not set Check Column so that many rows will have the same value. For example, selecting a single Check Column containing the day's date (rather than a timestamp) might result in thousands or millions of rows with the same value. | |
Connection Pool Size | Integer | 1 | Specify the maximum number of connections that Striim will use. This is the maximum number of tables Striim will be able to read simultaneously, provided that ThreadPool Size is set to the same or a higher number. |
Connection URL | String | When connecting through an SSH tunnel (see Using an SSH tunnel to connect to a source or target), specify the IP address of the tunnel. Specify | |
Database Provider Type | String | Default | Set to Postgres. Controls which icon appears in the Flow Designer. |
Excluded Tables | String | Data for any tables specified here will not be returned. For example, if | |
Fetch Size | Integer | 100 | Sets the maximum number of records to be fetched from the database in a single JDBC method execution (see the discussion of fetchsize in the documentation for your JDBC driver). |
Password | encrypted password | The password for the specified user. See Encrypted passwords. | |
Parallel Threads | Integer | 1 | Specify the number of threads Incremental Batch Reader will use. This value should not be higher than the number of tables to be read. When Incremental Batch Reader is started, the tables will be distributed among the threads in round-robin fashion, then each thread will read one table at a time. When the Tables property is a list of tables, the tables are read in that order. If the Tables property uses a wildcard, the sequence is determined by the order in which the JDBC driver gives Incremental Batch Reader the list of tables. Known issue DEV-49013: the maximum number of tables supported by Parallel Threads is 255. |
Polling Interval | String | 120sec | This property controls how often the adapter reads from the source. By default, it checks the source for new data every two minutes (120 seconds). If there is new data, the adapter reads it and sends it to the adapter's output stream. The value may be specified in seconds (as in the default) or milliseconds (for example, 500ms). |
Query | String | Optionally, specify a single SQL SELECT statement specifying the data to return. You may query tables, aliases, synonyms, and views. Query is not supported when Create Schema is True, Parallel Threads is greater than 1, or Restart Behavior on IL Interruption is Truncate_target_table or Replace_target_table. When If the query includes a synonym containing a period, it must be enclosed in escaped quotes. For example: If using a query when the output of a DatabaseReader source is the input of a DatabaseWriter target, specify the target table name as the value of DatabaseReader's Tables field. | |
Return DateTime As | String | Joda | Set to Set to |
Start Position | String | The value in the specified check column from which Striim will start reading. Striim will read rows in which the check column's value is the same as or greater or later than this value and skip the other rows. Since Check Column may specify multiple tables you must specify the corresponding table name or wildcard for each value. With the default value | |
Tables | String | Specify the table(s) or view(s) to be read. Specify names as You may specify multiple tables and views as a list separated by semicolons or with the If you are using the Query property, specify QUERY as the table name. Modifying this property can interfere with recovery. If recovery is enabled for the application and it is necessary to modify the tables list, export the application (see Apps page), drop it, update the exported TQL with the new values, and import it (see Apps page). | |
Threadpool Size | Integer | 5 | Specify the maximum number of threads the adapter will use. This is the maximum number of tables Striim will be able to read simultaneously, provided that Connection Pool Size is set to the same or a higher number. |
Username | String | Specify the DBMS user name the adapter will use to log in to the server specified in ConnectionURL. This user must have SELECT privileges on the tables specified in the Tables property. |
Incremental Batch Reader’s output type is WAEvent. Output is the same as for Database Reader. See Sample Database Reader WAEvent for an example.