Online Enterprise Database Migration to Google Cloud


Migrate to cloud

Migrating existing workloads to the cloud is a formidable step in an enterprise’s digital transformation journey. Moving an enterprise application from on-premises infrastructure to the cloud, or modernizing it to make the best use of cloud-native technologies, is only part of the challenge. A major part of the task is moving the existing enterprise databases while the business continues to operate at full speed.

Pause never

How the data is extracted and loaded into the new cloud environment plays a big role in keeping business-critical systems performant. Particularly for enterprise databases supporting mission-critical applications, avoiding downtime during migration is a must-have requirement, to minimize both risk and operational disruption.

For business-critical applications, the acceptable downtime approaches zero. Meanwhile, moving large amounts of data and thoroughly testing the business-critical applications can take days, weeks, or even months.

Keep running your business

The best practice in enterprise database migration, to minimize or even eliminate downtime, is to use online database migration, which keeps the application running.

In an online migration, changes from the enterprise source database are captured non-intrusively as real-time data streams using Change Data Capture (CDC) technology. This capability is available for most major databases, including Oracle, Microsoft SQL Server, HPE NonStop, MySQL, PostgreSQL, MongoDB, and Amazon RDS, but it has to be harnessed in the correct way.

In online database migration, you first load the source database into the cloud. Then, any changes that occurred in the source database while the initial load was running are applied to the target cloud database continuously from the real-time data stream. The source and target databases remain in sync until you are ready to cut over completely. You also retain the option to fall back to the source at any point, further minimizing risk.
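To make the sequencing concrete, here is a minimal sketch of the two phases in plain Python. The helper names (snapshot_source, stream_changes) and the data are hypothetical stand-ins, not Striim APIs; they only illustrate the initial load followed by continuous change delivery.

def snapshot_source():
    # Hypothetical helper: return all rows currently in the source table.
    return [{"id": 1, "amount": 100.0}]

def stream_changes():
    # Hypothetical helper: yield CDC events captured since the snapshot began.
    yield {"op": "INSERT", "row": {"id": 2, "amount": 50.0}}

def online_migrate(target):
    # Phase 1: bulk-load a consistent snapshot into the cloud target.
    for row in snapshot_source():
        target.append(("INSERT", row))
    # Phase 2: continuously apply changes captured during and after the load,
    # keeping source and target in sync until cutover.
    for event in stream_changes():
        target.append((event["op"], event["row"]))

target = []
online_migrate(target)
print(target)  # snapshot rows first, then the captured changes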

Integrate continuously

Online database migration also provides essential data integration services for new application development in the cloud. The change delivery can be kept running while you develop and test the new cloud applications. You may even choose to keep the target and source databases in sync indefinitely, typically for continuous database replication in hybrid or multi-cloud use cases.

Keep fresh

Once the real-time streaming data pipelines to the cloud are set up, businesses can easily build new applications and seamlessly adopt new cloud services to get the most operational value from the cloud environment. Real-time streaming is a crucial element in all such data movement use cases, and it can be widely applied to hybrid or multi-cloud architectures, operational machine learning, analytics offloading, large-scale cloud analytics, or any other scenario where having up-to-the-second data is essential to the business.

Change Data Capture

Striim, in strategic partnership with Google Cloud, offers online database migrations and real-time hybrid cloud data integration to Google Cloud through non-intrusive Change Data Capture (CDC) technologies. Striim enables real-time continuous data integration from on-premises and other cloud data sources to BigQuery, Cloud Spanner, and Cloud SQL for PostgreSQL, MySQL, and SQL Server, as well as to Cloud Pub/Sub, Cloud Storage, and other databases running in Google Cloud.

Replicate to Google Cloud

In addition to data migration, data replication is an important use case as well. In contrast to data migration, data replication continuously replicates data from a source system to a target system “forever,” with no intent to shut down the source system.

An example target system in the context of data replication is BigQuery, the data analytics platform of choice in Google Cloud. Striim supports continuous data streaming (replication) from an on-premises database to BigQuery when the source data has to remain on-premises and cannot be migrated. Striim bridges the two worlds and makes Google Cloud data analytics accessible by supporting the hybrid environment.

Transform in flight

In many cases, data migration and continuous streaming transport the data unmodified from the source to the target system. However, many use cases require data to be transformed to match the target system, or enriched and combined with data from other sources to complement and complete the target data set for increased value and expressiveness in a simple and robust architecture. This method is frequently referred to as Extract Transform Load, or ETL.

Striim provides flexible and powerful in-flight transformation and enrichment functionality to support use cases that go beyond simple one-time data migration.
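As a toy illustration of what such in-flight enrichment means (in plain Python rather than Striim’s own transformation language, and with made-up reference data), a change event from one source can be combined with a lookup from another before delivery:

# Hypothetical reference data from a second source.
customer_regions = {101: "EMEA", 102: "AMER"}

def enrich(event):
    # Combine the change event with reference data to complete the target row.
    return {**event, "region": customer_regions.get(event["customer_id"], "UNKNOWN")}

print(enrich({"order_id": 1, "customer_id": 101, "total": 99.50}))
# {'order_id': 1, 'customer_id': 101, 'total': 99.5, 'region': 'EMEA'}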

More to migrate? Keep replicating!

Enterprises typically have several data migration and online streaming use cases in flight at the same time. Often data migration takes place for some source databases while data replication is ongoing for others.

A single Striim installation can support several use cases at the same time, reducing the need for management and operational supervision. The Striim platform supports high-volume, high-velocity data with built-in validation, security, high availability, reliability, and scalability, as well as backup-driven disaster recovery, addressing enterprise requirements for operational excellence.

The following architecture shows an example where migration and online streaming are implemented at the same time. On the left, the database in the cloud is migrated to the Cloud SQL database on the right; after a successful migration, the source database is removed. In addition, the two source databases on the left, in an on-premises data center, are continuously streamed (replicated) to BigQuery for analytics and to Cloud Spanner for in-cloud processing.

Keep going

In this architecture, Striim itself, as the data migration technology, is deployed in a high-availability configuration. The three servers on Compute Engine form a cluster, with each server running in a different zone, making the cluster highly available and protecting the migration and online streaming from zone failures or outages.

Accelerate Cloud adoption

As organizations modernize their data infrastructure, integrating mission-critical databases is essential to ensure information is accessible, valuable, and actionable. The Striim and Google Cloud partnership supports Google Cloud customers with smooth data movement and continuous integration solutions, accelerating Google Cloud adoption and driving business growth.

Learn more

To learn more about enterprise cloud data integration, feel free to reach out to Striim and check out these references:

Google Cloud Solution Architecture: Architecting database migration and replication using Striim

Blog: Zero downtime database migration and replication to and from Cloud Spanner

Tutorial: Migrating from MySQL to BigQuery for Real-Time Data Analytics

Striim Google Virtual Hands-On Lab: Online Database Migration to Google Cloud using Striim

Self-paced Hands-on Lab: Online Data Migration to Cloud Spanner using Striim

Striim 3.10.1 Further Speeds Cloud Adoption


We are pleased to announce the general availability of Striim 3.10.1, which includes support for new and enhanced cloud targets, extends manageability and diagnostics capabilities, and introduces new ease-of-use features to speed our customers’ cloud adoption. Key features released in Striim 3.10.1 are directly available through Snowflake Partner Connect to enable rapid movement of enterprise data into Snowflake.

Striim 3.10.1 Focus Areas Including Cloud Adoption

This new release introduces many new features and capabilities, summarized here:

3.10.1 Features Summary


Let’s review the key themes and features of this new release, starting with the new and expanded cloud targets.

Striim on Snowflake Partner Connect

From Snowflake Partner Connect, customers can launch a trial Striim Cloud instance directly as part of the Snowflake on-boarding process from the Snowflake UI and load data, optionally with change data capture, directly into Snowflake from any of our supported sources. You can read about this in a separate blog.

Expanded Support for Cloud Targets to Further Enhance Cloud Adoption

The Striim platform has been chosen as a standard for our customers’ cloud adoption use cases partly because of the wide range of cloud targets it supports. Striim provides integration with databases, data warehouses, storage, messaging systems, and other technologies across all three major cloud environments.

A major enhancement is the introduction of support for the Google BigQuery Streaming API. This not only enables real-time analytics on large-scale data in BigQuery by ensuring that data is available within seconds of its creation, but also helps with quota issues that can be faced by high-volume customers. The integration through the BigQuery Streaming API can support data transfer rates of up to 1 GB per second.
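For readers who want to see what a streaming insert looks like outside of Striim, here is a minimal example using Google’s own google-cloud-bigquery Python client; the project, dataset, table, and rows are hypothetical, and configured credentials are assumed.

from google.cloud import bigquery

client = bigquery.Client()  # assumes application default credentials
table_id = "my-project.my_dataset.purchases"  # hypothetical table

rows = [
    {"purchase_id": 1, "amount": 42.50},
    {"purchase_id": 2, "amount": 17.25},
]

# Streaming insert: rows become available for query within seconds.
errors = client.insert_rows_json(table_id, rows)
if errors:
    print("Some rows failed:", errors)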

In addition to this, Striim 3.10.1 also has the following enhancements:

  • Optimized delivery to Snowflake and Azure Synapse that compacts multiple operations on the same data into a single operation on the target, resulting in much lower change volume
  • Delivery to MongoDB cloud and the MongoDB API for Azure Cosmos DB
  • Delivery to Apache Cassandra, DataStax Cassandra, and the Cassandra API for Azure Cosmos DB
  • Support for delivery of data in Parquet format to Cloud Storage and Cloud Data Lakes to further support cloud analytics environments

Schema Conversion to Simplify Cloud Adoption Workflows

As part of many cloud migration or cloud integration use cases, especially during the initial phases, developers often need to create target schemas to match those of the source data. Striim adds the capability to use source schema information from popular databases such as Oracle, SQL Server, and PostgreSQL to create appropriate target schemas in cloud targets such as Google BigQuery, Snowflake, and others. Importantly, these conversions understand the data type and structure differences between heterogeneous sources and targets, and act intelligently to spot problems and inconsistencies before progressing to data movement, simplifying cloud adoption.
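A rough sketch of what such heterogeneous type mapping involves is shown below, in plain Python. The mapping table is illustrative only, not Striim’s actual conversion logic; the point is that unmapped or inconsistent types are surfaced before any data moves.

# Illustrative Oracle-to-BigQuery type mapping (not Striim's internal tables).
ORACLE_TO_BIGQUERY = {
    "VARCHAR2": "STRING",
    "NUMBER": "NUMERIC",
    "DATE": "DATETIME",
    "TIMESTAMP": "TIMESTAMP",
    "CLOB": "STRING",
}

def convert_column(name, oracle_type):
    bq_type = ORACLE_TO_BIGQUERY.get(oracle_type.upper())
    if bq_type is None:
        # Surface the inconsistency before any data movement begins.
        raise ValueError(f"No BigQuery mapping for column {name} ({oracle_type})")
    return f"{name} {bq_type}"

print(convert_column("order_total", "NUMBER"))  # order_total NUMERIC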

Enhanced Monitoring, Alerting and Diagnostics

Ongoing data movement between on-premises and cloud environments, whether for migrations or for powering reporting and analytics solutions, is often part of an enterprise’s critical applications. As such, it demands deep insight into the status of all active data flows.

Striim 3.10.1 adds the capability to monitor data end to end, from its creation in the source to its successful delivery to a target, generate detailed lag reports, and alert on situations where lag is outside of SLAs.

End to End Lag Visualization
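Conceptually, the lag alerting works as sketched below (a toy Python illustration with a made-up SLA threshold, not Striim’s implementation): compare each event’s creation time at the source with its delivery time at the target.

from datetime import datetime, timedelta

SLA = timedelta(seconds=30)  # hypothetical lag SLA

def check_lag(created_at, delivered_at):
    lag = delivered_at - created_at  # end-to-end lag: source create to target delivery
    if lag > SLA:
        print(f"ALERT: end-to-end lag {lag} exceeds SLA {SLA}")

# An event created at 12:00:00 but delivered at 12:01:05 breaches the 30s SLA.
check_lag(datetime(2020, 5, 1, 12, 0, 0), datetime(2020, 5, 1, 12, 1, 5))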

In addition, this release provides detailed status on checkpointing information for recovery and high availability scenarios, with insight into checkpointing history and currency.

Real-time Checkpointing Information

Simplified Working with Complex Data

As customers work with heterogeneous environments and adopt more complex integration scenarios, they often have to work with complex data types or perform necessary data conversions. While this has always been possible through user-defined functions, this release adds multiple commonly requested data manipulation functions out of the box. This simplifies working with JSON data and document structures, while also facilitating data cleansing and regular expression operations.
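The kinds of manipulation meant here, shown with plain Python for orientation (the release ships these as built-in Striim functions; the field names below are made up):

import json
import re

raw = '{"user": " alice ", "phone": "555-0100"}'
doc = json.loads(raw)                      # parse a JSON document
doc["user"] = doc["user"].strip().title()  # cleanse a string field
digits = re.sub(r"\D", "", doc["phone"])   # regular expression: keep digits only
print(doc["user"], digits)                 # Alice 5550100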

On-Going Support for Enterprise Sources

As customers upgrade their environments or adopt new technologies, it is essential that their integration platform keeps pace. In Striim 3.10.1, we extend our support for the Oracle database to include Oracle 19c, including change data capture; add support for schema information and metadata for Oracle GoldenGate trails; and certify our support for Hive 3.1.0.

This has been a high-level view of the new features of Striim 3.10.1; there is a lot more to discover to aid your cloud adoption journey. If you would like to learn more about the new release, please reach out to schedule a demo with a Striim expert.

Implementing Gartner’s Cloud Smart FEVER selection process using Striim

In their recent research note, “Move From Cloud First to Cloud Smart to Improve Cloud Journey Success” (February 2020), Gartner introduced the concept of using the FEVER selection process to prioritize workloads to move to the cloud.

According to the research note, to ensure rapid results by building on the knowledge of earlier experiences with cloud, IT leaders “should prioritize the workloads to move to cloud by using a ‘full circle’ continuous loop selection process: faster, easier, valuable, efficient and repeat (FEVER; see Figure 2). This allows them to deliver results in waves of migrations according to the organization’s delivery capacities.”

While thinking about this concept, I realized that following this approach is one of the reasons that Striim’s customers are so successful with their cloud migration and integration initiatives. They are utilizing a cloud-smart approach for real-world use cases, including online database migrations enabled by change data capture, offloading reporting to cloud environments, and continuous data delivery for cloud analytics.

Faster

The speed of solutions is critical to many of our customers that have strict SLAs and limited timeframes in which they want to complete their projects. Striim allows customers to build and test data flows supporting cloud adoption very quickly, while Striim’s optimized architecture enables rapid transfer of data from data sources to the cloud for both the initial load and ongoing real-time data delivery.

Easier

Customers don’t want to spend days or weeks learning a new solution. In order to implement quickly, the solution must be easy to learn and work with. Striim’s wizard-based approach and intuitive UI enables our customers to rapidly build out their data pipelines, and transfer knowledge for on-going operations.

Valuable

Many of our customers are already ‘Cloud Smart’ and approach cloud initiatives in a pragmatic way. They often start with highly critical but simple migrations that give them the highest value in the shortest time. Once all the “lowest-hanging fruit” has been picked and successfully implemented, they move on to more complex scenarios or integrate additional sources.

Efficient

Cost-efficiency for our customers is about more than the ongoing cost reductions inherent in moving to a cloud solution. It also includes the time taken by their valuable employees to build and maintain the solution, and the data ingress costs inherent in moving their data to the cloud. By utilizing Striim, they can reduce the time spent achieving success and cut their data movement costs by using a one-time load followed by ongoing change delivery.

Repeat

Our customers seldom have only a single migration or cloud adoption project to perform. Repeatability and reusability of the cloud migration or integration are essential to their long-term plans. Not only do they want to be able to repeat similar migrations, but they also want to use the same platform for all of their cloud adoption initiatives. By standardizing on Striim, our customers can take advantage of the large number of sources and cloud targets we support and focus on the business imperatives without having to worry about whether it’s possible.


If you would like to learn more about becoming cloud smart, you can access the full report “Move From Cloud First to Cloud Smart to Improve Cloud Journey Success” (February 2020), for a limited time using this link.


Move From Cloud First to Cloud Smart to Improve Cloud Journey Success, Henrique Cecci, 25 February 2020

GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally, and is used herein with permission. All rights reserved.

This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Striim.

Cloud Adoption: How Streaming Integration Minimizes Risks


Last week, we hosted a live webinar, Cloud Adoption: How Streaming Integration Minimizes Risks. In just 35 minutes, we discussed how to eliminate database downtime and minimize other risks of cloud migration and ongoing integration for hybrid cloud architecture, including a live demo of Striim’s solution.

Our first speaker, Steve Wilkes, started the presentation by discussing the importance of cloud adoption for today’s pandemic-impacted, fragile business environment. He continued with the common risks of cloud data migration and how streaming data integration with low-impact change data capture minimizes both downtime and risks. Our second presenter, Edward Bell, gave us a live demonstration of Striim for zero-downtime data migration. In this blog post, you can find my short recap of the key areas of the presentation. This summary certainly cannot do justice to the comprehensive discussion we had at the webinar. That’s why I highly recommend you watch the full webinar on-demand to access details on the solution architecture, its comparison to the batch ETL approach, customer examples, the live demonstration, and the interactive Q&A section.

Cloud adoption brings multiple challenges and risks that prevent many businesses from modernizing their business-critical systems.

Limited cloud adoption and modernization reduces the ability to optimize business operations. These challenges and risks include downtime and business disruption, and the loss of data during the migration, which are simply not acceptable for critical business systems. The risk list, however, is longer than these two. Switching over to the cloud without adequate testing, working with stale data in the cloud, and data security and privacy are also among the key concerns.

Steve emphasized the point that “rushing the testing of the new environment to reduce the downtime, if you cannot continually feed data, can also lead to failures down the line or problems with the application.” Later, he added that “Beyond the migration, how do you continually feed the system? Especially in integration use cases where you are maintaining the data where it was and also delivering somewhere else, you need to continuously refresh the data to prevent staleness.”

Each of the risks mentioned above is preventable with the right approach to data movement between the legacy and new cloud systems.


Streaming data integration plays a critical role in successful cloud adoption with minimized risks.

A reliable, secure, and scalable streaming data integration architecture with low-impact change data capture enables zero database downtime and zero data loss during data migration. Because the source system is not interrupted, you can test the new cloud system for as long as you need before the switchover. You also have the option to fail back to the legacy system after the switchover by reversing the data flow and keeping the old system up to date with the cloud system until you are fully confident that it is stable.


Striim’s cloud data migration solution uses this modern approach. During the bulk load, Striim’s CDC component collects the source database changes in real time. As soon as the initial load is complete, Striim applies the changes to the target environment to keep the legacy and cloud databases consistent. With built-in exactly-once processing (E1P), Striim avoids both data loss and duplicates. You can also use Striim’s real-time dashboards to monitor the data flow and various detailed performance metrics.
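One common tactic behind exactly-once delivery, shown as a minimal Python sketch (this illustrates the general idea of an idempotent apply keyed by source position, not Striim’s E1P internals):

applied_through = 0  # last source position durably applied to the target

def apply_once(position, operation):
    global applied_through
    if position <= applied_through:
        return  # duplicate from a replay; skipping it avoids a double-apply
    # ... apply the operation to the target here ...
    applied_through = position

# After a restart, position 2 is replayed but applied only once.
for pos, op in [(1, "INSERT"), (2, "UPDATE"), (2, "UPDATE"), (3, "DELETE")]:
    apply_once(pos, op)
print(applied_through)  # 3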

Continuous, streaming data integration for hybrid cloud architecture liberates your data for modernization and business transformation.

Cloud adoption and streaming integration are not limited to the lifting and shifting of your systems to the cloud. Ongoing integration post-migration is a crucial part of planning your cloud adoption. You cannot restrict it to database sources and database targets in the cloud, either. Your data lives in various systems and needs to be shared with different endpoints, such as your storage, data lake, or messaging systems in the cloud environment. Without enabling comprehensive and timely data flow from your enterprise systems to the cloud, what you can achieve in the cloud will be very limited.

“It is all about liberating your data,” Steve added in this part of the presentation. “Making it useful for the purpose you need it for. Continuous delivery in the correct format from a variety of sources relies on being able to filter that data, transform it, and possibly aggregate, join and enrich before you deliver to where needed. All of these can be done in Striim with a SQL-based language.”

A key point both Edward and Steve made is that Striim is very flexible. You can source from multiple sources and send to multiple targets. True data liberation and modernizing your data infrastructure needs that flexibility.

Striim also provides deployment flexibility. In fact, this was a question in the Q&A section, asking about deployment options and pricing. Unfortunately, we could not answer all the questions we received. The short answer is: Striim can be deployed in the cloud, on-premises, or both via a hybrid topology. It is priced based on the CPUs of the servers where the Striim platform is installed, so you don’t need to worry about the sizes of your source and target systems.

There is much more covered in this short webinar we hosted on cloud adoption. I invite you to watch it on-demand at your convenience. If you would like to get a customized demo for cloud adoption or other streaming data integration use cases, please feel free to reach out.

A New Comprehensive Guide to Streaming ETL for Google Cloud


Not to brag, but since we literally wrote the book on data modernization with streaming data integration, it is our pleasure to provide you with a guide book on using streaming ETL for Google Cloud Platform. This eBook will help your company unleash innovative services and solutions by combining the power of streaming data integration with Google Cloud Platform services.

As part of your data modernization and cloud adoption efforts, you cannot ignore how you collect and move your data to your new data management platform. But, like adopting any new technology, there is complexity in the move and a number of things to consider, especially when dealing with mission-critical systems. We realize that the process of researching options, building requirements, getting consensus, and deciding on a streaming ETL solution for Google Cloud is never a trivial task.

As a technology partner of Google Cloud, we at Striim are thrilled to invite you to easily tap into the power of streaming ETL by way of our new eBook: A Buyer’s Guide to Streaming Data Integration for Google Cloud. If you’ve been looking to move to Google Cloud or get more operational value in your cloud adoption journey, this eBook is your go-to guide.

This eBook provides an in-depth analysis of the game-changing trends of digital transformation. It explains why a new approach to data integration is required, and how streaming data integration (SDI) (www.striim.com/blog/2020/01/streaming-data-integration-whiteboard-wednesdays/) fits into a modern data architecture. With many use case examples, the eBook shows you how streaming ETL for Google Cloud provides business value, and why this is a foundational step. You’ll discover how this technology is enabling the business innovations of today – from ride sharing and fintech, to same-day delivery and retail/e-retail.

Here’s a rundown of what we hope you’ll learn through this eBook:

  • A clear definition of what streaming integration is, and how it compares and contrasts with traditional extract/transform/load (ETL) tools
  • An understanding of how SDI fits into existing as well as emerging enterprise architectures
  • The role streaming data integration architecture plays with regard to cloud migration, hybrid cloud, multi-cloud, etc.
  • The true business value of adopting SDI
  • What companies and IT professionals should be looking for in a streaming data integration solution, focusing on the value of combining SDI and stream processing in one integrated platform
  • Modern SDI use cases, and how these are helping organizations to transform their business
  • Specifically, the benefits of using the Striim SDI platform in combination with the Google Cloud Platform

The digital business operates in real time, and the limitations of legacy integration approaches will hold you back from the limitless potential that cloud platforms bring to your business. To ease your journey into adopting streaming ETL to Google Cloud, please accept our tested and proven guidance with this new eBook: A Buyer’s Guide to Streaming Data Integration for Google Cloud. By following the practical steps provided for you, you can reap the full benefits of Google Cloud for your enterprise. For further information on streaming data integration or the Striim platform, please feel free to contact us.


Tutorial: Migrating from MySQL to BigQuery for Real-Time Data Analytics


In this post, we will walk through an example of how to replicate and synchronize your data from on-premises MySQL to BigQuery using change data capture (CDC).

Data warehouses have traditionally been on-premises services that required data to be transferred using batch load methods. Ingesting, storing, and manipulating data with cloud data services like Google BigQuery makes the whole process easier and more cost effective, provided that you can get your data in efficiently.

The Striim real-time data integration platform allows you to move data in real time as changes are recorded, using a technology called change data capture. This allows you to build real-time analytics and machine learning capabilities on your on-premises datasets with minimal impact.

Source MySQL Database

Before you set up the Striim platform to synchronize your data from MySQL to BigQuery, let’s take a look at the source database and prepare the corresponding database structure in BigQuery. For this example, I am using a local MySQL database with a simple purchases table to simulate a financial datastore that we want to ingest from MySQL to BigQuery for analytics and reporting.

I’ve loaded a number of initial records into this table and have a script to apply additional records once Striim has been configured to show how it picks up the changes automatically in real time.
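A stand-in for that load script might look like the following; the table name, columns, and credentials are hypothetical, and it assumes the mysql-connector-python package. Each committed insert is what the CDC reader later picks up from the binlog.

import mysql.connector

conn = mysql.connector.connect(
    host="localhost", user="striim", password="example",  # illustrative credentials
    database="demo",
)
cur = conn.cursor()
cur.execute(
    "INSERT INTO purchases (item, price) VALUES (%s, %s)",
    ("widget", 19.99),
)
conn.commit()  # only committed changes appear in the binlog for CDC to pick up
cur.close()
conn.close()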

Targeting Google BigQuery

You also need to make sure your instance of BigQuery has been set up to mirror the source or the on-premises data structure. There are a few ways to do this, but because you are using a small table structure, you are going to set this up using the Google Cloud Console interface. Open the Google Cloud Console, and select a project, or create a new one. You can now select BigQuery from the available cloud services. Create a new dataset to hold the incoming data from the MySQL database.

Once the dataset has been created, you also need to create a table structure. Striim can perform transformations while the data flows through the synchronization process. However, to make things a little easier here, I have replicated the same structure as the on-premises data source.
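If you prefer to script this step instead of using the console, the equivalent dataset and table creation with the google-cloud-bigquery client might look like this; the project, dataset name, and schema are hypothetical and should mirror your source table.

from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project
client.create_dataset("mysql_sync", exists_ok=True)  # hypothetical dataset name

schema = [
    bigquery.SchemaField("purchase_id", "INTEGER"),
    bigquery.SchemaField("item", "STRING"),
    bigquery.SchemaField("price", "NUMERIC"),
]
table = bigquery.Table("my-project.mysql_sync.purchases", schema=schema)
client.create_table(table, exists_ok=True)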

You will also need a service account to allow your Striim application to access BigQuery. Open the service accounts option through the IAM window in the Google Cloud Console and create a new service account. Grant the necessary permissions to the service account by assigning the BigQuery Owner and Admin roles, and download the service account key to a JSON file.

Set Up the Striim Application

Now you have your data in a table in the on-premises MySQL database and have a corresponding empty table with the same fields in BigQuery. Let’s now set up a Striim application on Google Cloud Platform for the migration service.

Open your Google Cloud Console and open or start a new project. Go to the marketplace and search for Striim. A number of options should return, but the option you are after is the first item that allows integration of real-time data to Google Cloud services.

Select this option and start the deployment process. For this tutorial, you are just using the defaults for the Striim server. In production, you would need to size appropriately depending on your load.

Click the deploy button at the bottom of this screen and start the deployment process.

Once this deployment has finished, the details of the server and the Striim application will be generated.

Before you open the admin site, you will need to add a few files to the Striim Virtual Machine. Open the SSH console to the machine and copy the JSON file with the service account key to a location Striim can access. I used /opt/striim/conf/servicekey.json.


Give these files the right permissions by running the following commands:

chown striim:striim <filename>

chmod 770 <filename>

You also need to restart the Striim services for these changes to take effect. The easiest way to do this is to restart the VM.

Once this is done, close the shell and click on the Visit The Site button to open the Striim admin portal.

Before you can use Striim, you will need to configure some basic details. Register your details and enter the cluster name (I used “DemoCluster”) and cluster password, as well as an admin password. Leave the license field blank to get a trial license if you don’t have one, then wait for the installation to finish.


When you get to the home screen for Striim, you will see three options. Let’s start by creating an app to connect your on-premises database with BigQuery to perform the initial load of data. To create this application, you will need to start from scratch from the applications area. Give your application a name and you will be presented with a blank canvas.

The first step is to read data from MySQL, so drag a database reader from the sources tab on the left. Double-click on the database reader to set the connection string with a JDBC-style URL using the template:

jdbc:mysql://<server_ip>:<port>/<database>
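For example, with a hypothetical server address and the demo database used in this walkthrough, the completed URL would be:

jdbc:mysql://10.128.0.5:3306/demo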

You must also specify the tables to synchronize — for this example, purchases — as this allows you to restrict what is synchronized.

Finally, create a new output. I called mine PurchasesDataStream.

You also need to connect your BigQuery instance to your source. Drag a BigQuery writer from the targets tab on the left. Double-click on the writer and select the input stream from the previous step and specify the location of the service account key. Finally, map the source and target tables together using the form:

<source-database>.<source-table>,<target-database>.<target-table>
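With the hypothetical names used in this walkthrough (a demo source database and a mysql_sync target dataset), the mapping would be:

demo.purchases,mysql_sync.purchases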

For this use case this is just a single table on each side.

Once both the source and target connectors have been configured, deploy and start the application to begin the initial load process. Once the application is deployed and running, you can use the monitor menu option on the top left of the screen to watch the progress.

Because this example contains a small data load, the initial load application finishes pretty quickly. You can now stop this initial load application and move on to the synchronization.

Updating BigQuery with Change Data Capture

Striim has pushed your current database up into BigQuery, but ideally you want to update this every time the on-premises database changes. This is where the change data capture application comes into play.

Go back to the applications screen in Striim and create a new application from a template. Find and select the MySQL CDC to BigQuery option.


Like the first application, you need to configure the details for your on-premises MySQL source. Use the same basic settings as before. However, this time the wizard adds the JDBC component to the connection URL.

When you click Next, Striim will ensure that it can connect to the local source. Striim will retrieve all the tables from the source. Select the tables you want to sync. For this example, it’s just the purchases table.

Once the local tables are mapped, you need to connect to the BigQuery target. Again, you can use the same settings as before by specifying the same service key JSON file, table mapping, and GCP Project ID.

Once the setup of the application is complete, you can deploy and turn on the synchronization application. This will monitor the on-premises database for any changes, then synchronize them into BigQuery.

Let’s see this in action by clicking on the monitor button again and loading some data into your on-premises database. As the data loads, you will see the transactions being processed by Striim.

Next Steps

As you can see, Striim makes it easy for you to synchronize your on-premises data from existing databases, such as MySQL, to BigQuery. By constantly moving your data into BigQuery, you could now start building analytics or machine learning models on top, all with minimal impact to your current systems. You could also start ingesting and normalizing more datasets with Striim to fully take advantage of your data when combined with the power of BigQuery.

To learn more about Striim for Google BigQuery, check out the related product page. Striim is not limited to MySQL to BigQuery integration, and supports many different sources and targets. To see how Striim can help with your move to cloud-based services, schedule a demo with a Striim technologist or download a free trial of the platform.

How to Migrate Oracle Database to Google Cloud SQL for PostgreSQL with Streaming Data Integration


For those who need to migrate an Oracle database to Google Cloud, the ability to move mission-critical data in real time between on-premises and cloud environments without database downtime or data loss is paramount. In this video, Alok Pareek, Founder and EVP of Products at Striim, demonstrates how the Striim platform enables Google Cloud users to build streaming data pipelines from their on-premises databases into their Cloud SQL environment with reliability, security, and scalability. The full 8-minute video is available to watch below:

Striim offers an easy-to-use platform that maximizes the value gained from cloud initiatives, including cloud adoption, hybrid cloud data integration, and in-memory stream processing. This demonstration illustrates how Striim feeds real-time data from mission-critical applications, from a variety of on-prem and cloud-based sources, to Google Cloud without interruption of critical business operations.


Through different interactive views, Striim users can develop Apps to build data pipelines to Google Cloud, create custom Dashboards to visualize their data, and Preview the Source data as it streams to ensure they’re getting the data they need. For this demonstration, Apps is the starting point from which to build the data pipeline.

There are two critical phases in this zero-downtime data migration scenario. The first involves the initial load of data from the on-premises Oracle database into the Cloud SQL Postgres database. The second is the synchronization phase, achieved through specialized readers that keep the source and target consistent.

Striim Flow Designer

The pipeline from the source to the target is built using a flow designer that easily creates and modifies streaming data pipelines. The data can also be transformed while in motion, to be realigned or delivered in a different format. Through the interface, the properties of the Oracle database can also be configured – allowing users extensive flexibility in how the data is moved.

Once the application is started, the data can be previewed, and progress monitored. While in-motion, data can be filtered, transformed, aggregated, enriched, and analyzed before delivery. With up-to-the-second visibility of the data pipeline, users can quickly and easily verify the ingestion, processing, and delivery of their streaming data.


During the initial load, the source data in the database is continually changing. Striim keeps the Cloud SQL Postgres database up to date with the on-premises Oracle database using change data capture (CDC). By reading the database transactions in the Oracle redo logs, Striim collects the insert, update, and delete operations as soon as the transactions commit, and applies only those changes to the target. This is done without impacting the performance of source systems, and avoids any outage to the production database.

By generating DML activity using a simulator, the demonstration shows how inserts, updates, and deletes are managed. While running DML operations against the orders table, the preview shows not only the data being captured, but also metadata including the transaction ID, the system commit number, the table name, and the operation type. Querying the orders table on the target confirms that the data is present.
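The shape of such a change event, with the metadata the preview displays, looks roughly like this (the field names and values are illustrative, not Striim’s exact metadata keys):

change_event = {
    "operation": "INSERT",          # insert, update, or delete
    "table": "SCOTT.ORDERS",        # source table (hypothetical)
    "transaction_id": "8.12.3451",  # transaction that produced the change
    "commit_scn": 4745321,          # system commit number from the redo log
    "data": {"order_id": 1001, "status": "SHIPPED"},
}
print(change_event["operation"], change_event["table"])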

The initial upload of data from the source to the target, followed by change data capture to ensure the source and target remain in sync, allows businesses to move data from on-premises databases into Google Cloud with the peace of mind that there will be no data loss and no interruption of mission-critical applications.

Additional Resources:

To learn more about Striim’s capabilities to support the data integration requirements for a Google hybrid cloud architecture, check out all of Striim’s solutions for Google Cloud Platform.

To read more about real-time data integration, please visit our Real-Time Data Integration solutions page.

To learn more about how Striim can help you migrate Oracle database to Google Cloud, we invite you to schedule a demo with a Striim technologist.



What’s New in Striim 3.9.5: More Cloud Integrations; Greater On-Prem Extensibility; Enhanced Manageability

Striim’s development team has been busy and launched a new release of the platform, Striim 3.9.5, last week. The goal of the release was to enhance the platform’s manageability while boosting its extensibility, both on-premises and in the cloud.

I’d like to give you a quick overview of the new features, starting with the expanded cloud integration capabilities.

  • Striim 3.9.5 now offers direct writers for both Azure Data Lake Storage Gen 1 and Gen 2. This capability allows businesses to stream real-time, pre-processed data to their Azure data lake solutions from enterprise databases, log files, messaging systems such as Kafka, Hadoop, NoSQL, and sensors, deployed on-prem or in the cloud.
  • Striim’s support for Google Pub/Sub is now improved with a direct writer. Google Pub/Sub serves as a messaging service for GCP services and applications. Rapidly building real-time data pipelines into Google Pub/Sub from existing on-prem or cloud sources allows businesses to seamlessly adopt GCP for their critical business operations and achieve the maximum benefit from their cloud solutions.
  • Striim has been providing streaming data integration to Google BigQuery since 2016. With this release, Striim supports additional BigQuery functionalities such as SQL MERGE.
  • Similarly, the new release brings enhancements to Striim’s existing Azure Event Hubs Writer and Amazon Redshift Writer to simplify development and management.

In addition to cloud targets, Striim has expanded its heterogeneous sources and destinations for on-premises environments, too. The 3.9.5 release includes:

  • Writing to and reading from Apache Kafka version 2.1
  • Real-time data delivery to HPE NonStop SQL/MX
  • Support for compressed data when reading from GoldenGate Trail Files
  • Support for NCLOB columns in log-based change data capture from Oracle databases

Following on from the 3.9 release, Striim 3.9.5 also adds a few new features to improve Striim’s ease of use and manageability:

  • Striim’s users can now organize their applications into user-defined groups and see deployment status via color-coded indicators in the UI. This feature increases productivity, especially when there are hundreds of Striim applications running or being deployed, as many of our customers have.
  • New recovery status indicators in Striim 3.9.5 allow users to track when an application is in replay mode for recovery versus in forward-processing mode after recovery has completed.
  • Striim’s application management API now allows resuming a crashed application.
  • Last but not least, Striim 3.9.5 offers easier and more detailed monitoring of open transactions in Oracle database sources.

For a deeper dive into the new features in Striim 3.9.5, please request a customized demo. If you would like to check out any of these features for yourself, we invite you to download a free trial.


Striim Is a 2019 CODiE Awards Finalist for Best iPaaS Solution

Striim is proud to announce that we’ve been recognized by SIIA as a 2019 CODiE Awards Finalist for Best iPaaS, or Integration Platform as a Service.

Why was Striim selected as a Best iPaaS solution? Striim is the only streaming (real-time) data integration platform running in the cloud that is built specifically to support cloud computing.

Real-time data integration is crucial for hybrid and multi-cloud architectures. Striim’s iPaaS solutions for real-time data integration in the cloud bring the agility and cost benefits of the cloud to integration use cases.

Striim enables companies to:

  • Quickly and easily provision streaming data pipelines to deliver real-time data to the cloud, or between cloud services
  • Easily adopt a multi-cloud architecture by seamlessly moving data across different cloud service providers: Azure, AWS, and Google Cloud
  • Offload operational workloads to cloud by moving data in real time and in the desired format
  • Filter, aggregate, transform, and enrich data-in-motion before delivering to the cloud in order to optimize cloud storage
  • Migrate data to the cloud without interrupting business operations
  • Minimize risk of cloud migrations with real-time, built-in cloud migration monitoring to avoid data divergence or data loss
  • Stream data in real time between cloud environments and back to on-premises systems

As one of the best iPaaS solutions, the Striim platform supports all aspects of Cloud integration as it relates to hybrid cloud and multi-cloud deployments.

Striim enables zero-downtime data migration to the cloud by performing an initial load, and then delivering the changes that occurred in the legacy system during the load, all without pausing the source system. To prevent data loss, it validates that all of the data from on-premises sources has migrated to the cloud environment.

Striim’s iPaaS solution provides the real-time data pipelines to and from the cloud to enable operational workloads in the cloud with the availability of up-to-date data.

Striim supports multi-cloud architecture by streaming data between different cloud platforms, including Azure, Google and AWS, and other cloud technologies such as Salesforce and Snowflake. If necessary, Striim can also provide real-time data flows between services offered within each of the three cloud platforms.

About Striim for Data iPaaS

Running as a PaaS solution on Microsoft Azure, AWS and Google Cloud Platform, the Striim streaming data integration platform offers real-time data ingestion from on-premises and cloud-based databases (including Oracle, SQL Server, HPE NonStop, PostgreSQL and MySQL), data warehouses (such as Oracle Exadata and Teradata), cloud services (such as AWS RDS and Amazon S3), Salesforce, log files, messaging systems (including Kafka), sensors, and Hadoop solutions.

Striim delivers this data in real time to a wide variety of cloud services (for example, Azure SQL Data Warehouse, Cosmos DB and Event Hubs; Amazon Redshift, S3 and Kinesis; and Google BigQuery, Cloud SQL and Pub/Sub), with in-flight transformations and enrichments.

Users can rapidly provision and deploy integration applications via a click-through interface using Striim’s pre-built templates and pre-configured integrations that are optimized for their cloud endpoints.

To learn more about Striim’s capabilities as one of the best iPaaS solutions, check out our three-part blog series, “Striim for Data iPaaS.”


Striim Announces Real-Time Data Migration to Google Cloud Spanner

The Striim team has been working closely with Google to deliver an enterprise-grade solution for online data migration to Google Cloud Spanner. We’re happy to announce that it is available in the Google Cloud Marketplace. This PaaS solution facilitates the initial load of data (with exactly-once processing and delivery validation), as well as the ongoing, continuous movement of data to Cloud Spanner.

The real-time data pipelines enabled by Striim from both on-prem and cloud sources are scalable, reliable and high-performance. Cloud Spanner users can further leverage change data capture to replicate data in transactional databases to Cloud Spanner without impacting the source database, or interrupting operations.

Google Cloud Spanner is a cloud-based database system that is ACID compliant, horizontally scalable, and global. Spanner is the database that underlies much of Google’s own data collection, and it has been designed to offer the consistency of a relational database with the scale and performance of a non-relational database.

Migration to Google Cloud Spanner requires a low-latency, low-risk solution to feed mission-critical applications. Striim offers an easy-to-use solution to move data in real time from Oracle, SQL Server, PostgreSQL, MySQL, and HPE NonStop to Cloud Spanner while ensuring zero downtime and zero data loss. Striim is also used for real-time data migration from Kafka, Hadoop, log files, sensors, and NoSQL databases to Cloud Spanner.

While the data is streaming, Striim enables in-flight processing and transformation of the data to maximize usability of the data the instant it lands in Cloud Spanner.

To learn more about Striim’s Real-Time Migration to Google Cloud Spanner, read the related press release, view our Striim for Google Cloud Spanner product page, or provision Striim’s Real-Time Data Integration to Cloud Spanner in the Google Cloud Marketplace.


What is iPaaS for Data?

Organizations can leverage a wide variety of cloud-based services today, and one of the fastest growing offerings is integration platform as a service. But what is iPaaS?

There are two major categories of iPaaS solutions available, focusing on application integration and data integration. Application integration works at the API level, typically involves relatively low volumes of messages, and enables multiple SaaS applications to be woven together.

Integration platform as a service for data enables organizations to develop, execute, monitor, and govern integration across disparate data sources and targets, both on-premises and in the cloud, with processing and enrichment of the data as it streams.

Within the scope of iPaaS for data there are older batch offerings and more modern real-time streaming solutions. The latter are better suited to the on-demand and continuous way organizations are utilizing cloud resources.

Streaming data iPaaS solutions facilitate integration through intuitive UIs, providing pre-configured connectors, automated operators, wizards, and visualization tools that ease the creation of data pipelines for real-time integration. With the iPaaS model, companies can develop and deploy the integrations they need without having to install or manage additional hardware or middleware, or acquire specific skills related to data integration. This can result in significant cost savings and accelerated deployment.

This is particularly useful as enterprise-scale cloud adoption becomes more prevalent, and organizations are required to integrate on-premises data and cloud data in real time to serve the company’s analytics and operational needs.

Factors such as increasing awareness of the benefits of iPaaS among enterprises – including reduced cost of ownership and operational optimization – are fueling the growth of the market worldwide.

For example, a report by Markets and Markets notes that the Integration Platform as a Service market is estimated to grow from $528 million in 2016 to nearly $3 billion by 2021, at a compound annual growth rate (CAGR) of 42% during the forecast period.

“The iPaaS market is booming as enterprises [embrace] hybrid and multi-cloud strategies to reduce cost and optimize workload performance” across on-premises and cloud infrastructure, the report says. Organizations around the world are adopting iPaaS and considering the deployment model an important enabler for their future, the study says.

Research firm Gartner, Inc. notes that the enterprise iPaaS market is an increasingly attractive space due to the need for users to integrate multi-cloud data and applications, with various on-premises assets. The firm expects the market to continue to achieve high growth rates over the next several years.

By 2021, enterprise iPaaS will be the largest market segment in application middleware, Gartner says, potentially consuming the traditional software delivery model along the way.

“iPaaS is a key building block for creating platforms that disrupt traditional integration markets, due to a faster time-to-value proposition,” Gartner states.

The Striim platform can be deployed on-premises, but is also available as an iPaaS solution on Microsoft Azure, Google Cloud Platform, and Amazon Web Services. This solution can integrate with on-premises data through a secure agent installation. For more information, we invite you to schedule a demo with one of our lead technologists, or download the Striim platform.


19 For 19: Technology Predictions For 2019 and Beyond

Striim’s 2019 Technology Predictions article was originally published on Forbes.

With 2018 out the door, it’s important to take a look at where we’ve been over these past twelve months before we embrace the possibilities of what’s ahead this year. It has been a fast-moving year in enterprise technology. Modern data management has been a primary objective for most enterprise companies in 2018, evidenced by the dramatic increase in cloud adoption, strategic mergers and acquisitions, and the rise of artificial intelligence (AI) and other emerging technologies.

Continuing on from my predictions for 2018, let’s take out the crystal ball and imagine what could be happening technology-wise in 2019.

2019 Technology Predictions for Cloud

• The center of gravity for enterprise data centers will shift faster towards cloud as enterprise companies continue to expand their reliance on the cloud for more critical, high-value workloads, especially for cloud-bursting and analytics applications.

• Technologies that enable real-time data distribution between different cloud and on-premises systems will become increasingly important for almost all cloud use-cases.

• With the acquisition of Red Hat, IBM may not directly challenge the top providers but will play an essential role through the use of Red Hat technologies across these clouds, private clouds, and on-premises data centers in increasingly hybrid models.

• Portable applications and serverless computing will accelerate the move to multi-cloud and hybrid models utilizing containers, Kubernetes, cloud and multi-cloud management, with more and more automation provided by a growing number of startups and established players.

• As more open-source technologies mature in the big data and analytics space, they will be turned into scalable managed cloud services, cannibalizing the revenue of commercial companies built to support them.

2019 Technology Predictions for Big Data

• Despite consolidation in the big data space, as evidenced by the Cloudera/Hortonworks merger, enterprise investment in big data infrastructure will wane as more companies move to the cloud for storage and analytics. (Full disclosure: Cloudera is a partner of Striim.)

• As 5G begins to make its way to market, data will be generated at even faster speeds, requiring enterprise companies to seriously consider modernizing their architecture to work natively with streaming data and in-memory processing.

• Lambda and Kappa architectures combining streaming and batch processing and analytics will continue to grow in popularity driven by technologies that can work with both real-time and long-term storage sources and targets. Such mixed-use architectures will be essential in driving machine learning operationalization.

• Data processing components of streaming and batch big data analytics will widely adopt variants of the SQL language to enable self-service processing and analytics by users that best know the data, rather than developers that use APIs.

• As more organizations operate in real time, fast, scalable SQL-based architectures like Snowflake and Apache Kudu will become more popular than traditional big data environments, driven by the need for continual up-to-date information.

2019 Technology Predictions for Machine Learning/Artificial Intelligence

• AI and machine learning will no longer be considered a specialty and will permeate business on a deeper level. By adopting centralized cross-functional AI departments, organizations will be able to produce, share and reuse AI models and solutions to realize rapid return on investment (ROI).

• The biggest benefits of AI will be achieved through integration of machine learning models with other essential new technologies. The convergence of AI with internet of things (IoT), blockchain and cloud investments will provide the greatest synergies with ground-breaking results.

• Data scientists will become part of DevOps in order to achieve rapid machine learning operationalization. Instead of being handed raw data, data scientists will move upstream and work with IT specialists to determine how to source, process and model data. This will enable models to be quickly integrated with real-time data flows, as well as continually evaluating, testing and updating models to ensure efficacy.

2019 Technology Predictions for Security

• The nature of threats will shift from many small actors to larger, stronger, possibly state-sponsored adversaries, with industrial rather than consumer data being the target. The sophistication of these attacks will require more comprehensive real-time threat detection integrated with AI to adapt to ever-changing approaches.

• As more organizations move to cloud analytics, security and regulatory requirements will drastically increase the need for in-flight masking, obfuscation and encryption technologies, especially around PII and other sensitive information.

2019 Technology Predictions for IoT

• IoT, especially sensors coupled with location data, will undergo extreme growth, but will not be purchased directly by major enterprises. Instead, device makers and supporting real-time processing technologies will be combined by integrators using edge processing and cloud-based systems to provide complete IoT-based solutions across multiple industries.

• The increased variety of IoT devices, gateways and supporting technologies will lead to standardization efforts around protocols, data collection, formatting, canonical models and security requirements.

2019 Technology Predictions for Blockchain

• The adoption of blockchain-based digital ledger technologies will become more widespread, driven by easy-to-operate and manage cloud offerings in Amazon Web Services (AWS) and Azure. This will provide enterprises a way to rapidly prototype supply chain and digital contract implementations. (Full disclosure: AWS and Azure are partners of Striim.)

• Innovative new secure algorithms, coupled with computing power advances, will speed up the processing time of digital ledger transactions from seconds to milliseconds or microseconds in the next few years, enabling high-velocity streaming applications to work with blockchain.

Whether or not any of these 2019 technology predictions come to pass, we can be sure this year will bring a mix of steady movement towards enterprise modernization, continued investment in cloud, streaming architecture and machine learning, and a smattering of unexpected twists and new innovations that will enable enterprises to think — and act — nimbly.

Any thoughts or feedback on my 2019 technology predictions? Please share on Steve’s LinkedIn page: https://www.linkedin.com/in/stevewilkes/. For more information on Striim’s solutions in the areas of Cloud, Big Data, Security, and IoT, please visit our Solutions page, or schedule a brief demo with one of our lead technologists.