Striim 3.9.6 documentation

Incremental Batch Reader

Works like DatabaseReader but has two additional properties, Check Column and Start Position, which allow you to specify that reading will begin at a user-selected position. To specify the starting point, the table(s) to be read must have a column containing either a timestamp or a sequential number. The most common use case is for populating data warehouses.

Supported databases are MySQL, Oracle, PostgreSQL, SQL Server, and Teradata. See Database Reader for supported data types and conversions.Database Reader

Warning

Except for PostgreSQL, the JDBC driver for the DBMS must be installed as described in Installing database drivers.

When using IncrementalBatchReader as a source, Azure SQL DWH WriterHive Writer, and Redshift Writer support SQL MERGE, so updates in the source database are replicated in the target. Other targets support only SQL INSERT, so updates in the source database become inserts in the target.Redshift Writer

property

type

default value

notes

Check Column

java.lang.String

Specify the name of the column containing the start position value. The values must be unique and continuously increasing, such as the creation timestamp or a unique identifier such as an employee ID.

MySQL and Oracle names are case-sensitive, SQL Server names are not. Use the syntax <schema name>.<table name>=<column name> for MySQL and Oracle and <database name>.<schema name>.<table name>=<column name> for SQL Server.

If you specify multiple tables in the Tables property, you may specify different check columns for the tables separated by semicolons. In this case, you may specify the check column for the remaining tables using wildcards: for example, MYSCHEMA.TABLE1=UUID; MYSCHEMA.%=LAST_UPDATED would use UUID as the start column for TABLE1 and LAST_UPDATED as the start column for the other tables.

Connection URL

java.lang.String

  • for MySQL: jdbc:mysql://<ip address>:<port>/<database name>

  • for Oracle: jdbc:oracle:thin:@<hostname>:<port>:<SID> (using Oracle 12c with PDB, use the SID for the PDB service) or jdbc:oracle:thin:@<hostname>:<port>/<service name>

  • for PostgreSQL, jdbc:posgresql://<ip address>:<port>/<database name>

  • for SQL Server: jdbc:sqlserver://<ip address>:<port>;DatabaseName=<database name>

  • for Teradata: jdbc:teradata://<ip address>/DBS_PORT=<port>,DATABASE=<database name>

Database Provider Type

java.lang.String

Default

Controls which icon appears in the Flow Designer.

Excluded Tables

java.lang.String

When a wildcard is specified for Tables, you may specify here any tables you wish to exclude from the query. Specify the value as for Tables. For example, to include data from HR_EMPLOYEES and HR_DEPTS but not HRMASTER when reading from SQL Server (since you cannot specify a literal underscore in the Tables string):

Tables='HR%',
ExcludedTables='HRMASTER'

Fetch Size

java.lang.Integer

100

maximum number of records to be fetched from the database in a single JDBC method execution (see the discussion of fetchsize in the documentation for the your JDBC driver)

Password

com.webaction. security.Password

The password for the specified user. See Encrypted passwords.

Polling Interval

java.lang.String

120sec

number of seconds to watch between fetches

Return DateTime As

java.lang.String

Joda

Set to String to return timestamp values as strings rather than Joda timestamps. The primary purpose of this option is to avoid losing precision when microsecond timestamps are converted to Joda milliseconds. The format of the string is yyyy-mm-dd hh:mm:ss.ffffff.

Start Position

java.lang.String

The value in the specified check column from which Striim will start reading. Striim will read rows in which the check column's value is the same as or greater or later than this value and skip the other rows. Since Check Column may specify multiple tables you must specify the corresponding table name or wildcard for each value. With the Check Column example above, the Start Position value could be MYSCHEMA.TABLE1=1234; MYSCHEMA.%=2018-OCT-07 18:37:55.

Tables

java.lang.String

The table(s) or view(s) to be read. MySQL and Oracle names are case-sensitive, SQL Server names are not. Specify names as <schema name>.<table name> for MySQL and Oracle and as <database name>.<schema name>.<table name> for SQL Server.

For Oracle and SQL Server, you may specify multiple tables and views as a list separated by semicolons or with the following wildcards:

  • %: any series of characters

  • \_: any single character in MySQL or Oracle (note that the underscore does not need to be escaped when using OracleReader)

  • _: any single character in SQL Server

For example, HR.% would read all tables in the HR schema.

When reading from Oracle, _ is a literal underscore, so, for example, HR_% would include HR_EMPLOYEES and HR_DEPTS but not HRMASTER. (Note that when using OracleReader the opposite is the case: _ is the wildcard and \_ is a literal underscore.)

When reading from SQL Server, there is no way to specify a literal underscore.

Username

java.lang.String

the DBMS user name the adapter will use to log in to the server specified in ConnectionURL

Vendor Config

java.lang.string

If the source is Oracle and it uses SSL, specify the required SSL properties (see the notes on SSL Config in OracleReader properties).

The following would read rows from TABLE1 with UUID column values equal to or greater than 1234: 

CREATE SOURCE OraSource USING IncrementalBatchReader  ( 
  Username: 'striim',
  Password: '********',
  ConnectionURL: '192.0.2.:1521:orcl',
  Tables: 'MYSCHEMA.TABLE1',
  CheckColumn: 'MYSCHEMA.TABLE1=UUID',
  StartPosition: 'MYSCHEMA.TABLE1=1234'
) 
OUTPUT TO OraSourceOutput;

If IncrementalBatchReader sends duplicate records to a DatabaseWriter target, by default the application will crash. This can happen, for example, when recovery is enabled (see Recovering applications), there are multiple rows with the same CheckColumn timestamp, and only some of them were written before a system failure,  To avoid this, specify the appropriate IgnorableException in the target: for example, for CosmosDBWriter, RESOURCE_ALREADY_EXISTS.Recovering applications