Skip to main content

Striim for BigQuery Documentation

Set up your PostgreSQL source

You must perform all setup tasks appropriate for your source environment before you can create a pipeline. If any of these tasks are not complete, the corresponding prerequisite checks will fail.

In all environments, make note of the slot name (the examples use striim_slot but you can use any name you wish). When creating a pipeline, provide the slot name on the Additional Settings page.

In all environments, if you plan to have Striim propagate PostgreSQL schema changes to BigQuery, you must create a tracking table in the source database. To create this table, run pg_ddl_setup_410.sql, which you can download from https://github.com/striim/doc-downloads. When creating a pipeline, provide the name of this table on the Additional Settings page.

PostgreSQL setup in Linux or Windows

This will require a reboot, so it should probably be performed during a maintenance window.

  1. Install the wal2json plugin for the operating system of your PostgreSQL host as described in https://github.com/eulerto/wal2json.

  2. Edit postgressql.conf, set the following options, and save the file. The values for max_replication_slots and max_wal_senders may be higher but there must be one of each available for each instance of PostgreSQL Reader. max_wal_senders cannot exceed the value of max_connections.

    wal_level = logical
    max_replication_slots = 1
    max_wal_senders = 1
  3. Edit pg_hba.conf and add the following records, replacing <IP address> with the Striim server's IP address. If you have a multi-node cluster, add a record for each server that will run PostgreSQLReader. Then save the file and restart PostgreSQL.

    local  replication  striim  <IP address>/0  trust
    local  replication  striim                  trust
  4. Restart PostgreSQL.

  5. Enter the following command to create the replication slot (the location of the command may vary but typically is /usr/local/bin in Linux or C:\Program Files\PostgreSQL\<version>\bin\ in Windows.

    pg_recvlogical -d mydb --slot striim_slot --create-slot -P wal2json

    If you plan to use multiple instances of PostgreSQL Reader, create a separate slot for each.

  6. Create a role with the REPLICATION attribute for use by Striim and give it select permission on the schema(s) containing the tables to be read. Replace ****** with a strong password and myschema with the name of your schema.

    CREATE ROLE striim WITH LOGIN PASSWORD '******' REPLICATION;
    GRANT SELECT ON ALL TABLES IN SCHEMA public TO striim;
    

PostgreSQL setup in Amazon Aurora with PostgreSQL compatibility

You must set up replication at the cluster level. This will require a reboot, so it should probably be performed during a maintenance window.

Amazon Aurora supports logical replication for PostgreSQL compatibility options 10.6 and later. Automated backups must be enabled. To set up logical replication, your AWS user account must have the rds_superuser role.

For additional information, see Using PostgreSQL logical replication with Aurora, Replication with Amazon Aurora PostgreSQL, and Using logical replication to replicate managed Amazon RDS for PostgreSQL and Amazon Aurora to self-managed PostgreSQL.

  1. Go to your RDS dashboard, select Parameter groups > Create parameter group.

  2. For the Parameter group family, select the aurora-postgresql item that matches your PostgreSQL compatibility option (for example, for PostgreSQL 11, select aurora-postgresql11).

  3. For Type, select DB Cluster Parameter Group.

  4. For Group Name and Description, enter aurora-logical-decoding, then click Create.

  5. Click aurora-logical-decoding.

  6. Enter logical_ in the Parameters field to filter the list, click Modify, set rds.logical_replication to 1, and click Continue > Apply changes.

  7. In the left column, click Databases, then click the name of your Aurora cluster, click Modify, scroll down to Database options (you may have to expand the Additional configuration section), change DB cluster parameter group to aurora-logical-decoding, then scroll down to the bottom and click Continue.

  8. Select Apply immediately > Modify DB instance. Wait for the cluster's status to change from Modifying to Available, then stop it, wait for the status to change from Stopping to Stopped, then start it.

  9. In PSQL, enter the following command to create the replication slot:

    SELECT pg_create_logical_replication_slot('striim_slot', 'wal2json');
  10. Create a role with the REPLICATION attribute for use by PostgreSQLReader and give it select permission on the schema(s) containing the tables to be read. Replace ****** with a strong password and (if necessary) public with the name of your schema.

    CREATE ROLE striim WITH LOGIN PASSWORD '******';
    GRANT rds_replication TO striim;
    GRANT SELECT ON ALL TABLES IN SCHEMA public TO striim;
    

PostgreSQL setup in Amazon RDS for PostgreSQL

You must set up replication in the master instance. This will require a reboot, so it should probably be performed during a maintenance window.

Amazon RDS supports logical replication only for PostgreSQL version 9.4.9, higher versions of 9.4, and versions 9.5.4 and higher. Thus PostgreSQLReader can not be used with PostgreSQL 9.4 - 9.4.8 or 9.5 - 9.5.3 on Amazon RDS.

For additional information, see Best practices for Amazon RDS PostgreSQL replication and Using logical replication to replicate managed Amazon RDS for PostgreSQL and Amazon Aurora to self-managed PostgreSQL.

  1. Go to your RDS dashboard, select Parameter groups > Create parameter group, enter posstgres-logical-decoding as the Group name and Description, then click Create.

    createParameterGroup.png
  2. Click postgres-logical-decoding.

    parameterGroups.png
  3. Enter logical_ in the Parameters field to filter the list, click Modify, set rds.logical_replication to 1, and click Continue > Apply changes.

    RDSPostgreSQLParameterGroup.png
  4. In the left column, click Databases, then click the name of your database, click Modify, scroll down to Database options (you may have to expand the Additional configuration section), change DB parameter group to postgres-logical-decoding, then scroll down to the bottom and click Continue.

  5. Select Apply immediately > Modify DB instance. Wait for the database's status to change from Modifying to Available, then reboot it and wait for the status to change from Rebooting to Available.

  6. In PSQL, enter the following command to create the replication slot:

    SELECT pg_create_logical_replication_slot('striim_slot', 'wal2json');
  7. Create a role with the REPLICATION attribute for use by PostgreSQLReader and give it select permission on the schema(s) containing the tables to be read. Replace ****** with a strong password and (if necessary) public with the name of your schema.

    CREATE ROLE striim WITH LOGIN PASSWORD '******';
    GRANT rds_replication TO striim;
    GRANT SELECT ON ALL TABLES IN SCHEMA public TO striim;
    

PostgreSQL setup in Azure SQL Managed Instance

Azure Database for PostgreSQL - Hyperscale is not supported because it does not support logical replication.

  1. Set up logical decoding using wal2json:

  2. Get the values for the following properties which you will need to set in Striim:

PostgreSQL setup in Cloud SQL for SQL Server

  1. Set up logical replication as described in Setting up logical replication and decoding.

  2. Get the values for the following properties which you will need to set in Striim: