Skip to main content

BigQuery initial load

You can use Database Reader for initial load to create a point-in-time copy of your existing source BigQuery dataset at the target. This establishes a baseline for subsequent continuous updates.

You can leverage Database Reader's parallel processing capabilities to accelerate data loading for large datasets, while the system automatically handles schema creation and data type mapping between BigQuery and target systems. You can configure table selection with wildcard support, create custom SQL queries for selective data extraction, and optimize batch sizes for performance tuning..

The required BigQuery JDBC driver is bundled with Striim Platform.

Database Reader properties for BigQuery sources

property

type

default value

notes

Connection Profile Name

enum

Appears in Flow Designer only when Use Connection Profile is True. See Connection profiles.

Connection URL

String

Specify as jdbc:googlebigquery:AuthScheme=OAuthJWT;OAuthJWTCert=<path to service account key>;InitiateOAuth=GETANDREFRESH;ProjectId=<BigQuery project ID> or use a connection profile (see Connection profiles).

If Striim Platform is running in a VM in Google Cloud Platform, you may use the following syntax to use the service account key associated with the VM: jdbc:googlebigquery:AuthScheme=GCPInstanceAccount;ProjectId=<BigQuery project ID>.

Create Schema

Boolean

False

If set to True, when Database Reader's output stream is the input stream of an Azure Synapse Writer, BigQuery Writer, Database Writer (for MariaDB, MariaDB Xpand, MySQL, Oracle, PostgreSQL, SQL Server, or YugabyteDB only), Databricks Writer, Fabric Data Warehouse Writer, Fabric Mirror Writer, Iceberg Writer, Snowflake Writer, or Spanner Writer target, the schema(s) and tables specified in the Tables property in the writer will be created in the target.

For example, if the Tables property in the writer is sourceschema1.%,targetschema1.%;sourceschema2.%,targetschema2.% then the schemas targetschema1 and targetschema2 will be created in the target, and all the tables in sourceschema1 and sourceschema2 will be created in the corresponding target schema.

Data types will be mapped as per Target data type support & mapping for Snowflake sources. Single-column PRIMARY KEY constraints will be replicated in the target. DEFAULT, FOREIGN KEY, NOT NULL, composite PRIMARY KEY, and UNIQUE KEY constraints will not be replicated.

If a table already exists in the target, Striim will verify that the source and target structure match. If they do not, the application will halt and log a TargetTableSchemaMismatchException.

MON output for the target tables will include a schemaCreationStatus of Pending, InProgress, Success, or Failed.

Database Provider Type

String

Default

Set to BigQuery. Controls which icon appears in the Flow Designer and whether Use Connection Profile appears in the web UI.

Excluded Tables

String

Data for any tables specified here will not be returned. For example, if Tables uses a wildcard, data from any tables specified here will be omitted. Multiple table names (separated by semicolons) and wildcards may be used exactly as for Tables.

Fetch Size

Integer

100

Sets the maximum number of records to be fetched from the database in a single JDBC method execution (see the discussion of fetchsize in the documentation for your JDBC driver).

JAAS Configuration

String

Not applicable to BigQuery in this release.

Password

encrypted password

Not used when reading from BigQuery.

Parallel Threads

Integer

1

Not applicable when reading from BigQuery.

Query

String

Not applicable when reading from BigQuery.

Quiesce on IL Completion

Boolean

False

Restart Behavior on IL Interruption

Enum

Keep target table data

See Fast Snapshot Recovery during initial load.

Return DateTime As

String

Joda

For BigQuery, this setting is ignored and timestamp values are always returned as Java DateTime.

SSL Config

String

Not applicable to BigQuery in this release.

Tables

String

Specify the table(s) or view(s) to be read. Specify names as <database name>.<table name>. Names are case-sensitive unless BigQuery's is_case_insensitive option is set to TRUE.

You may specify multiple tables and views as a list separated by semicolons or with the % wildcard. For example, HR% would read all the tables whose names start with HR. You may use the % wildcard only for tables, not for schemas or databases. The wildcard is allowed only at the end of the string: for example, mydb.prefix% is valid, but mydb.%suffix is not.

Modifying this property can interfere with recovery. If recovery is enabled for the application and it is necessary to modify the tables list, export the application (see Apps page), drop it, update the exported TQL with the new values, and import it (see Apps page).

Use Connection Profile

Boolean

False

See Connection profiles.

Username

String

Not used when reading from BigQuery.

Vendor Configuration

Striing

Not applicable to BigQuery in this release.

Sample Database Reader WAEvent

For the following row:

id  first_name  last_name  phone  street          city          state  zip_code
1   Deborah     Burks      NULL   9273 Thorne AV  Orchard Park  NY     14127

The WAEvent would be similar to:

WAEvent{
data: [1,"Deborah","Burks",null,"9273 Thorne AV","Orchard Park","NY","14127"]
metadata: {"TableName":"sales.customers","ColumnCount":8,"EntityName":"sales.customers""OperationName":"SELECT",
  "OperationType":"DML","OPERATION_TS":1681412863364}
userdata: null
before: null
dataPresenceBitMap: "AAA="
beforePresenceBitMap: "AAA="
typeUUID: {"uuidstring":"01edda2e-77f7-9b21-83c2-8e859085da65"}
};

The operation name for Database Reader WAEvents is always SELECT.

Database Reader data type support and correspondence for BigQuery sources

BigQuery type

TQL type

ARRAY

java.lang.String

BIGNUMERIC

java.math.BigDecimal

BOOL

java.lang.Boolean

BYTES

java.lang.String

DATE

java.time.LocalDate

DATETIME

java.time.LocalDateTime

FLOAT64

java.lang.Double

GEOGRAPHY

java.lang.String

INT64

java.lang.Long

INTERVAL

java.lang.String

JSON

java.lang.String

NUMERIC

java.math.BigDecimal

RANGE

java.lang.String

STRING

java.lang.String

STRUCT

java.lang.String

TIME

java.lang.String

TIMESTAMP

java.time.ZonedDateTime