BigQuery Writer

BigQuery is a cloud-based data warehousing and analytics platform developed by Google. It allows users to store, analyze, and query large datasets in a fast and scalable way using SQL-like queries.

You can use Striim’s BigQuery Writer to write data from transactional databases such as Oracle and SQL Server, applications such as Salesforce and ServiceNow, NoSQL databases such as Cosmos DB and MongoDB, object stores such as Amazon S3 and Google Cloud Storage, and other supported sources into BigQuery. BigQuery Writer operates with low latency to support real-time analytics and data processing, business intelligence, machine learning, and other tasks.

BigQuery Writer summary

Supported sources

BigQuery Writer can write data from all sources supported by Striim.

Authentication

BigQuery Writer authenticates its connection to BigQuery using a Google service account key. The adapter uses OAuth2 for authorization and TLS 1.2 to encrypt the connection.

Supported writing methods

BigQuery Writer supports three writing methods, each of which uses a different Google API:

Supported write modes

BigQuery Writer supports two write modes:

  • Merge: Records that are inserted, updated, or deleted in the source database(s) are inserted, updated, or deleted in BigQuery, so the data in BigQuery mirrors the current data in the source database(s).

  • Append Only: Insert, update, and delete operations in the source database(s) are all treated as inserts in BigQuery. Thus, you can use BigQuery to query old data that no longer exists in the source database(s), for example, for month-over-month or year-over-year reports.
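
The difference between the two write modes can be illustrated with a small in-memory sketch. This is not BigQuery Writer's implementation; the ChangeEvent class, the apply_merge and apply_append_only functions, and the sample rows are all hypothetical names made up for illustration. It only shows how the same stream of source operations would leave the target in each mode.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChangeEvent:
    """Hypothetical change record: one operation on one source row."""
    op: str                 # "INSERT", "UPDATE", or "DELETE"
    key: int                # primary key of the affected row
    row: Optional[dict]     # new row image, or None for a delete


def apply_merge(target: dict, event: ChangeEvent) -> None:
    """Merge mode: the target ends up mirroring the source."""
    if event.op == "DELETE":
        target.pop(event.key, None)
    else:
        # Inserts and updates both leave the latest row image in place.
        target[event.key] = event.row


def apply_append_only(target: list, event: ChangeEvent) -> None:
    """Append Only mode: every operation becomes a new row in the target."""
    target.append({"op": event.op, "key": event.key, "row": event.row})


events = [
    ChangeEvent("INSERT", 1, {"name": "Ann"}),
    ChangeEvent("UPDATE", 1, {"name": "Anne"}),
    ChangeEvent("DELETE", 1, None),
]

merged = {}     # keyed by primary key, like a mirrored table
appended = []   # ever-growing history of operations

for e in events:
    apply_merge(merged, e)
    apply_append_only(appended, e)
```

After the three events, the merged target is empty because the row was deleted at the source, while the append-only target still holds all three operations, which is what makes historical reporting possible.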

Additional writing features

  • Supports auto-quiesce after an initial load from Cosmos DB Reader, Database Reader, Mongo Cosmos DB Reader, or MongoDB Reader.

  • Supports schema evolution to detect and propagate DDL changes from supported sources to the BigQuery tables.

Resilience and recovery

  • Supports connection retry to avoid application halting due to transient connection issues.

  • Supports recovery with at-least-once processing (see Recovering applications).
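
As a rough illustration of what connection retry means in general, the sketch below retries a write when a transient connection error occurs, waiting longer between attempts. It is a generic pattern, not Striim's actual retry logic: write_with_retry, its max_retries and base_delay parameters, and the flaky_write stand-in are hypothetical. Note also that with at-least-once processing, a batch whose acknowledgement was lost may be written again after recovery, so duplicates are possible.

```python
import time


def write_with_retry(write_batch, batch, max_retries=3, base_delay=0.01,
                     sleep=time.sleep):
    """Call write_batch(batch), retrying on ConnectionError.

    Waits base_delay * 2**attempt seconds between attempts and re-raises
    the error once max_retries retries have been used up.
    """
    attempt = 0
    while True:
        try:
            return write_batch(batch)
        except ConnectionError:
            if attempt >= max_retries:
                raise  # give up: the failure no longer looks transient
            sleep(base_delay * (2 ** attempt))
            attempt += 1


# Hypothetical target that fails twice before accepting the batch.
calls = {"n": 0}
written = []


def flaky_write(batch):
    calls["n"] += 1
    if calls["n"] <= 2:
        raise ConnectionError("transient network issue")
    written.extend(batch)
    return len(batch)


# Pass a no-op sleep so the demonstration does not actually wait.
result = write_with_retry(flaky_write, ["a", "b"], sleep=lambda seconds: None)
```

Here the first two attempts fail and the third succeeds, so the application keeps running instead of halting on the first error.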

Performance

Supports parallel threads (see Creating multiple writer instances) to increase throughput to the target.

Programmability

  • Flow Designer

  • TQL

  • Wizards in the web UI to create applications from the following sources:

    • Initial load with Auto Schema Conversion (using Database Reader) from MariaDB, MySQL, Oracle, PostgreSQL, or SQL Server

    • CDC from MariaDB, MySQL, Oracle, PostgreSQL, or SQL Server

    • Amazon S3

    • HDFS

    • Incremental Batch Reader

    • Salesforce

    • ServiceNow

Metrics and auditing

Key metrics are available through Striim's monitoring features (see Monitoring Guide).

Java client version

BigQuery Writer uses Google Cloud BigQuery Client for Java (google-cloud-bigquery) version 2.3.3.

Notes on BigQuery terminology

Some of BigQuery's terms have different meanings than they do in the context of popular SQL databases.

  • Project: contains one or more datasets, similar to the way an Oracle CDB contains one or more pluggable databases (PDBs).

  • Dataset: contains one or more tables, similar to an Oracle, PostgreSQL, or SQL Server schema.

  • Schema: defines column names and data types for a table, similar to a CREATE TABLE DDL statement in SQL.

  • Table: equivalent to a table in SQL.
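
The project, dataset, and table levels above come together in a fully qualified table name of the form project.dataset.table. The following sketch splits and rebuilds such a name. TableRef and parse_table_ref are hypothetical helpers, the sample names are invented, and the code assumes that none of the three parts contains a dot.

```python
from typing import NamedTuple


class TableRef(NamedTuple):
    """Hypothetical holder for the three levels of a BigQuery table name."""
    project: str
    dataset: str
    table: str

    def qualified(self) -> str:
        """Return the name in project.dataset.table form."""
        return f"{self.project}.{self.dataset}.{self.table}"


def parse_table_ref(name: str) -> TableRef:
    """Split 'project.dataset.table' into its parts.

    Assumes no part contains a dot; raises ValueError otherwise.
    """
    parts = name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected project.dataset.table, got {name!r}")
    return TableRef(*parts)


ref = parse_table_ref("myproject.sales.orders")
```

In this example, ref.dataset is "sales", which plays the role that a schema would play in Oracle, PostgreSQL, or SQL Server.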