Resources > Tech Guides

Replicating Your On-Premises Database into Google Cloud SQL for MySQL

 

Introduction

Companies of all sizes can leverage the scale, simplicity, and security of deploying their data infrastructure on the Google Cloud Platform (GCP). GCP offers a full suite of data management products such as Google BigQuery for scalable data warehousing, Cloud SQL, Cloud Spanner, and Cloud Pub/Sub.

In this technology guide, we will cover database replication to Google Cloud SQL. Google Cloud SQL is a scalable offering of two of the most popular and proven enterprise databases – MySQL and PostgreSQL. It allows companies to deploy a fully managed, low maintenance database in Google Cloud.

We will perform the replication using Striim. Striim provides the next generation solution for data integration that continuously collects, processes and moves enterprise data in real time. Unlike traditional batch and scheduled based ETL solutions, Striim’s solution continuously integrates data from on-premises and cloud data sources into Google Cloud SQL.

Customers are using Striim for streaming large volumes of data from a variety of sources including from OLTP databases with very large schemas containing 100-1000s of tables into Google Cloud SQL.

Objective

The key objective of this tech guide is to illustrate the steps required to set up continuous, real-time replication from an existing Oracle database into Google Cloud SQL for MySQL using Striim.

Striim Solutions for GCP

Striim offers the following solutions on the GCP marketplace:

  • Real-Time Integration to Cloud SQL for MySQL
  • Real-Time Integration to Cloud SQL for PostgreSQL
  • Real-Time Integration to Spanner
  • Real-Time Integration to BigQuery
  • Real-Time Integration to Google Cloud Storage

Striim’s solutions enable rapidly developing applications on the GCP that can be configured to continually replicate data in real time from on-premises databases into appropriate Google cloud databases.

This tech note focuses on the steps required for successfully implementing the solution for replicating data from on-premises Oracle DB into Google Cloud SQL for MySQL.

Striim Solution Development and Deployment Options

Striim provides a rich web-based UI for developing, testing and deploying integration solutions. Striim offers flexible deployment options when integrating between on-premises and the cloud. Depending upon the use case requirements, which includes resource availability, integration topology, security restrictions, etc., appropriate option for deploying the solution can be chosen.  The options for deployment include:

  • Develop and deploy solutions entirely on the cloud
  • Develop and deploy entirely on-premises
  • Develop on the cloud and deploy portions of the solution in a hybrid on-premise and cloud topology.

The following diagram illustrates a simple Striim Application that enables continuous, real-time replication between an Oracle database and Cloud SQL for MySQL. Striim’s unique features allow the users to develop applications on the cloud and deploy them on-premises and in the cloud.


Steps for Replicating from On-Premises Oracle Database to Google Cloud SQL for MySQL

Select or create Google Cloud Platform Project and associate the Striim solution and Cloud SQL for MySQL services with the project.

Launch and Configure MySQL Service

Select Cloud SQL for MySQL and launch it and ensure it starts successfully.
Navigate to the MySQL instance and

  • Note down the private IP address and port (default = 3306), database username and password. This is required when configuring the MySQL adapter in the Striim application for accessing the MySQL database via JDBC.
  • Note down the VNC details. This is required to configure the Striim service network so it can access the MySQL service.Prepare MySQL database – Create the database and tables that are required when replicating the data from source Oracle database.

Prepare MySQL database – Create the following table, which is used by Striim application for the purposes of ensuring no data loss and duplicates in the event of application recovery due to failures (e.g. network connectivity etc.).

Create the following table, which is used by Striim application for the purposes of ensuring no data loss and duplicates in the event of application recovery due to failures (e.g. network connectivity etc.).


Launch and Configure Solution “Real-Time Integration to Cloud SQL for MySQL”

Select Striim solutions from the marketplace and launch it and ensure that it starts successfully.

  • Prior to launching, ensure the Striim solution “Real-Time Integration to Cloud SQL for MySQL” instance and the Cloud SQL (MySQL) instance are on the same sub-net or has a default route or peering route set up if they are on different sub-nets.
  • Navigate to the Deployment manager and ensure the service has started successfully. Click on the site address and login into the Striim solution with the provided admin username and password.
  • Striim configures the firewall rules that enables communication with the Striim agent running on-premises (configuring agents is discussed below).
  • Installing MySQL JDBC driver
    Due to licensing restrictions, MySQL JDBC driver is not included with the Striim solution. The following steps highlights the details for installing the JDBC driver:

    • Click on SSH from the above view and login into the Striim solution VM using the browser or cloud shell.
    • Stop the Striim service
      sudo systemctl stop striim-node
      sudo systemctl stop striim-dbms
    • Install the MySQL JDBC driver
      • Download the Connector/J 5.1.46 package from https://dev.mysql.com/downloads/connector/j/5.1.html and extract it.
      • Copy mysql-connector-java-5.1.46.jar to /opt/striim/lib.
    • Start the Striim service
      sudo systemctl stop striim-node
      sudo systemctl stop striim-dbms
    • Get the Striim cluster name and encrypted password
      • Navigate to /opt/Striim/conf/startUp.properties.
      • Note down the values for “WAClusterName” and “WAClusterPassword” (this is required for configuring the agent discussed below).
      • Note down the public IP address of the Striim Solution VM.

Configuring Striim Agent and Oracle DB in the On-Premises Environment

Configuring Striim Agent

Striim agents are light-weight process that run on-premises close to the sources. Agents execute the on-premises flow that extract data and publish events to the cloud. Striim’s agent architecture enables high performance and supports the desired secure connectivity from on-premises into the cloud.

  • Login into Striim solution on GCP (as discussed above).
  • From the Quick Links, download the agent on to your on-premises machine.
  • Unzip the agent into an appropriate directory
  • Configure the agent
    Navigate to <agent directory>/conf/ and modify the conf as follows.

    Parameter name Value
    Striim.cluster.clustername Enter value of WAClusterName
    Striim.cluster.password Enter value of WAClusterPassword
    Striim.node.servernode.address Enter value of public IP address of Striim VM
  • Install Oracle JDBC driver
    • Download ojdbc8.jar from oracle.com.
    • Save that file in <agent directory>/lib.
  • Start the agent
    • <agent directory>/bin/agent.sh
  • Verify the agent server connectivity
    • Login into Striim solution on GCP (as discussed above). In the UI, navigate to “Monitor” overview and scroll down and check the agent status and valid values (>0) for “used-mem” and “tot mem.”


Configuring Oracle DB for CDC

The following tasks must be performed regardless of which Oracle version or variation you are using. These tasks are essential for extracting transactions in real-time from an Oracle database.

Enabling archivelog

The following steps check whether archivelog is enabled and, if not, enable it.

  • Log in to SQL*Plus as the sys user and do the following:
    • select log_mode from v$database;
      If the command returns ARCHIVELOG, it is enabled. Skip ahead to Enabling supplemental log data.
    • If the command returns NOARCHIVELOG, enter:
      shutdown immediate.
    • Wait for the message ORACLE instance shut down, then
      enter: startup mount.
    • Wait for the message Database mounted, then enter:
      alter database archivelog;
      alter database open;
    • To verify that archivelog has been enabled, enter:
      select log_mode from v$database; again. This time it should return ARCHIVELOG.

Enabling supplemental log data

The following steps check whether supplemental log data is enabled and, if not, enable it.

  • Enter the following command:
    select supplemental_log_data_min, supplemental_log_data_pk from v$database;
    If the command returns YES or IMPLICIT, supplemental log data is already enabled. For example,

    SUPPLEME SUP
    ————— ——
    YES NO

    indicates that supplemental log data is enabled, but primary key logging is not. If it returns anything else, enter:
    alter database add supplemental log data;

  • To enable primary key logging for all tables in the database enter:
    alter database add supplemental log data (primary key) columns;
  • Alternatively, to enable primary key logging only for selected tables (do not use this approach if you plan to use wildcards in the OracleReader Tables property to capture change data from new tables):
    alter table <schema name>.<table name> add supplemental log data (primary key) columns;
    To activate your changes, enter:
    alter system switch logfile;

Note down the IP address and port (default = 1521), database username and password. This is required when configuring the Oracle database adapter in the Striim application for accessing the Oracle database via JDBC.

Ensure the firewall on-premises has the rule to allow request from public IP address of Striim VM.

Configuring Application – Adapters

Login into Striim solution on GCP (as discussed above).

Navigate to the Oracle DB to MySQL application.

Configure Oracle DB CDC adapter
Test the connection.

Configure the MySQL adapter

Execute the application.

Configuring Network Connectivity

The network connectivity configuration between on-premises and the cloud discussed above is adequate for a proof of concept, however, in production scenarios typically the on-premises and GCP will be connected via VPN.  In this scenario the agent, Striim server configuration and firewall rules needs to be configured as follows.

Striim agent is deployed on-premises. Striim server is deployed on a cloud instance on a private sub-net. On-premises and cloud private sub-net are connected via VPN tunnel.

Agent side configuration – Agent.conf

Parameter name Value
Striim.cluster.clustername “enter your cluster name”
Striim.cluster.password “enter your encrypted password”
Striim.node.servernode.address Striim.customername.com

 

Configure:

/etc/hosts

10.99.99.99   Striim.customername.com

$ nslookup Striim.customername.com

Name: Striim.customername.com

Address: 10.99.99.99

Server (cloud) side configuration – startUp.properties

Parameter name Value
ServerFqdn Striim.customername.com

Configure:

/etc/hosts

10.99.99.99   Striim.customername.com

$ nslookup Striim.customername.com

Name: Striim.customername.com

Address: 10.99.99.99

Verify the agent and server connectivity as discussed above.

Conclusion

In this technical guide, we introduced the Striim real-time data replication solution for Google Cloud SQL, enabling businesses to set up streaming data pipelines and online database migration solutions. We touched upon Striim’s solution architecture, illustrated sample use cases, and briefly highlighted the capabilities of Striim platform

We invite you to test drive Striim’s solution for Google Cloud, either by downloading the platform from www.striiim.com, or provisioning the Striim PaaS solutions in Google Cloud Marketplace.