Incremental Batch Reader
Works like DatabaseReader but has two additional properties, Check Column and Start Position, which allow you to specify that reading will begin at a user-selected position. To specify the starting point, the table(s) to be read must have a column containing either a timestamp or a sequential number. The most common use case is for populating data warehouses.
See also Spanner Batch Reader.
See Connection URL below for a list of supported databases. See Database Reader for supported data types.
Warning
Except for Oracle, PostgreSQL, and SQL Server 2012 or later, the JDBC driver for the DBMS must be installed as described in Installing database drivers.
property | type | default value | notes |
---|---|---|---|
Check Column | String | Specify the name of the column containing the start position value. The values must be unique and continuously increasing, such as the creation timestamp or a unique identifier such as an employee ID. MySQL and Oracle names are case-sensitive, SQL Server names are not. Use the syntax If you specify multiple tables in the Tables property, you may specify different check columns for the tables separated by semicolons. In this case, you may specify the check column for the remaining tables using wildcards: for example, | |
Connection URL | String | The following databases are supported.
| |
Database Provider Type | String | Default | Controls which icon appears in the Flow Designer. |
Excluded Tables | String | When a wildcard is specified for Tables='HR%', ExcludedTables='HRMASTER' | |
Fetch Size | Integer | 100 | maximum number of records to be fetched from the database in a single JDBC method execution (see the discussion of fetchsize in the documentation for the your JDBC driver) |
Password | encrypted password | The password for the specified user. See Encrypted passwords. | |
Polling Interval | String | 120sec | time to wait between fetches; may be specified in seconds (as in the default) or milliseconds (for example, 500ms) |
Return DateTime As | String | Joda | Set to |
Start Position | String | The value in the specified check column from which Striim will start reading. Striim will read rows in which the check column's value is the same as or greater or later than this value and skip the other rows. Since Check Column may specify multiple tables you must specify the corresponding table name or wildcard for each value. With the Check Column example above, the Start Position value could be | |
Tables | String | The table(s) or view(s) to be read. MySQL and Oracle names are case-sensitive, SQL Server names are not. Specify names as For Oracle and SQL Server, you may specify multiple tables and views as a list separated by semicolons or with the following wildcards:
For example, HR.% would read all tables in the HR schema. When reading from Oracle, _ is a literal underscore, so, for example, HR_% would include HR_EMPLOYEES and HR_DEPTS but not HRMASTER. (Note that when using OracleReader the opposite is the case: When reading from SQL Server, there is no way to specify a literal underscore. | |
Username | String | the DBMS user name the adapter will use to log in to the server specified in ConnectionURL | |
Vendor Config | String | If the source is Oracle and it uses SSL, specify the required SSL properties (see the notes on SSL Config in Oracle Reader properties). |
The following would read rows from TABLE1 with UUID column values equal to or greater than 1234:
CREATE SOURCE OraSource USING IncrementalBatchReader ( Username: 'striim', Password: '********', ConnectionURL: '192.0.2.:1521:orcl', Tables: 'MYSCHEMA.TABLE1', CheckColumn: 'MYSCHEMA.TABLE1=UUID', StartPosition: 'MYSCHEMA.TABLE1=1234' ) OUTPUT TO OraSourceOutput;
If IncrementalBatchReader sends duplicate records to a DatabaseWriter target, by default the application will crash. This can happen, for example, when recovery is enabled (see Recovering applications), there are multiple rows with the same CheckColumn timestamp, and only some of them were written before a system failure, To avoid this, specify the appropriate IgnorableException in the target: for example, for CosmosDBWriter, RESOURCE_ALREADY_EXISTS.