Building pipelines from BigQuery
You can read from BigQuery and write to any target supported by Striim. Typically, you will set up data pipelines that read from BigQuery in two phases, an initial load followed by continuous replication, as explained in the concept article on Pipelines.
For initial load, you can use Database Reader to create a point-in-time copy of the existing source BigQuery dataset at the target, as described in BigQuery initial load.
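For orientation, here is a minimal TQL sketch of such an initial-load application. The application, stream, source, and target names are illustrative placeholders, as are the service-account details in the JDBC URL; your actual connection details come from BigQuery initial setup, and the FileWriter target stands in for whatever Striim-supported writer your pipeline uses.

    CREATE APPLICATION BigQueryInitialLoad;

    -- Authentication is via the service-account parameters in the BigQuery
    -- JDBC URL; substitute the connection details from BigQuery initial setup.
    CREATE SOURCE BigQuerySource USING DatabaseReader (
      ConnectionURL: 'jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;ProjectId=my_project;OAuthType=0;OAuthServiceAcctEmail=loader@my_project.iam.gserviceaccount.com;OAuthPvtKeyPath=/path/to/key.json',
      Tables: 'my_dataset.%'
    )
    OUTPUT TO BigQueryStream;

    -- Replace FileWriter and DSVFormatter with the writer and formatter
    -- appropriate for your chosen target.
    CREATE TARGET InitialLoadTarget USING FileWriter (
      filename: 'bq_initial_load'
    )
    FORMAT USING DSVFormatter ()
    INPUT FROM BigQueryStream;

    END APPLICATION BigQueryInitialLoad;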
After initial load has completed, you can use Incremental Batch Reader to read new source data at regular intervals, as described in BigQuery continuous incremental replication, allowing for continuous updates in near real time. Note that because Incremental Batch Reader uses JDBC rather than CDC, it has some limitations, including that it does not capture DELETE operations at the source.
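The continuous-replication side differs mainly in the source adapter. The sketch below is likewise illustrative: the table name, check column, start position, and polling interval are placeholder values, and the check column must be one whose value increases with every new row (for example, a timestamp or sequence number) so the reader can detect new data.

    -- Uses the same JDBC connection URL as the initial-load source. Verify
    -- the exact CheckColumn / StartPosition value syntax in the Incremental
    -- Batch Reader documentation.
    CREATE SOURCE BigQueryIncrementalSource USING IncrementalBatchReader (
      ConnectionURL: 'jdbc:bigquery://...;ProjectId=my_project;...',
      Tables: 'my_dataset.orders',
      CheckColumn: 'my_dataset.orders=updated_at',
      StartPosition: 'my_dataset.orders=2024-01-01 00:00:00',
      PollingInterval: '120'
    )
    OUTPUT TO BigQueryIncrementalStream;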
Before building a pipeline, you must complete the steps described in BigQuery initial setup.
To create separate applications for initial load and continuous replication:
Create a schema and tables in the target and perform initial load: use a wizard with a Database Reader source.
Perform an initial load when the schema and tables already exist in the target: use a wizard with a Database Reader source.
Switch from initial load to continuous replication: see Switching from initial load to continuous replication of BigQuery sources.
Replicate new data: use a wizard with an Incremental Batch Reader source.
Alternatively, you can create applications without the wizards using Flow Designer, TQL, or Striim's REST API.