Moving Real-Time Data to Azure Cosmos DB with Striim

In this video, you will see how Striim can help feed Cosmos DB in real time through our wizard-based UI and intuitive data pipelines.

Azure Cosmos DB is Microsoft’s globally distributed, multi-model database service. You have chosen Cosmos DB to store ever-increasing volumes of data and make this data available in milliseconds. However, most of your source data resides elsewhere – in a wide variety of on-premises or cloud sources. How do you continually move this data to Cosmos DB in real time, so that your fast analytics and insights are reporting on timely data?


Video Transcription:

Azure Cosmos DB was built to achieve low latency and high availability in a globally distributed world. By elastically and independently scaling throughput and storage across multiple Azure regions worldwide, you can access your data when and where you want. And support for multiple models means you can use SQL, Cassandra, MongoDB, and other APIs to get to your data.

However, residing in the cloud means you have to determine how to move your existing data to Cosmos DB. This could be migrating an existing SQL Server, Oracle, MySQL, or PostgreSQL operational database, or continually populating Cosmos DB with newly generated on-premises data from logs or devices. In order for Cosmos DB to provide up-to-date information, there should be as little latency as possible between the original data creation and its delivery to the cloud.

The Striim platform can help with all these requirements and more. Our database adapters support change data capture, or CDC, from enterprise or cloud databases. CDC directly intercepts database activity and collects all the inserts, updates, and deletes as they happen, ready to stream into Cosmos DB. Adapters for machine logs and other files read from the end of multiple files in parallel to stream out data as it is written, removing the inherent latency of batch. Data from devices and messaging systems can also be collected easily, independent of its format, through a variety of high-speed adapters and parsers.
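To make the idea of change data capture concrete, here is a minimal Python sketch of how a stream of insert, update, and delete events could be applied to a target store. It is an illustration only, not Striim's API: the event shape (`op`, `key`, `data`), the `apply_changes` function, and the in-memory dictionary standing in for the target are all assumptions made for this example.

```python
from typing import Any, Dict, Iterable


def apply_changes(target: Dict[Any, dict], events: Iterable[dict]) -> Dict[Any, dict]:
    """Apply change events, in order, to an in-memory target keyed by primary key.

    The event shape {"op", "key", "data"} is hypothetical, chosen for this sketch.
    """
    for event in events:
        op, key = event["op"], event["key"]
        if op == "INSERT":
            target[key] = dict(event["data"])
        elif op == "UPDATE":
            # Merge only the changed columns into the existing record.
            target.setdefault(key, {}).update(event["data"])
        elif op == "DELETE":
            target.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return target


# Example change stream, as a source database might emit it over time.
change_events = [
    {"op": "INSERT", "key": 1, "data": {"name": "Ada", "email": "ada@old.example"}},
    {"op": "INSERT", "key": 2, "data": {"name": "Bob", "email": "bob@old.example"}},
    {"op": "UPDATE", "key": 1, "data": {"email": "ada@new.example"}},
    {"op": "DELETE", "key": 2},
]

store = apply_changes({}, change_events)
print(store)
```

Because the events are applied in the order they occurred, the target ends up matching the source without a bulk reload.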

After being collected continuously, the streaming data can be delivered directly into Azure Cosmos DB with very low latency, or pushed through a data pipeline where it can be pre-processed through filtering, transformation, enrichment, and correlation using SQL-based queries, before delivery into Cosmos DB. This enables such things as data denormalization, change detection, deduplication, and quality checking before the data is ever stored.
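As a rough illustration of in-flight pre-processing, the sketch below chains a quality check, deduplication, and a simple transformation over a stream of records. It uses plain Python generators rather than the SQL-based queries described above, and the field names (`id`, `amount`, `amount_cents`) and function names are assumptions invented for this example.

```python
from typing import Iterable, Iterator


def quality_check(records: Iterable[dict]) -> Iterator[dict]:
    """Drop records that are missing a required field."""
    for record in records:
        if record.get("id") is not None and record.get("amount") is not None:
            yield record


def deduplicate(records: Iterable[dict]) -> Iterator[dict]:
    """Pass each id through once. The set grows without bound here;
    a long-running stream would bound it, for example with a time window."""
    seen = set()
    for record in records:
        if record["id"] not in seen:
            seen.add(record["id"])
            yield record


def transform(records: Iterable[dict]) -> Iterator[dict]:
    """Add a derived field before the record is stored."""
    for record in records:
        yield {**record, "amount_cents": round(record["amount"] * 100)}


raw_records = [
    {"id": 1, "amount": 12.5},
    {"id": 1, "amount": 12.5},   # duplicate
    {"id": 2, "amount": None},   # fails the quality check
    {"id": 3, "amount": 3.0},
]

cleaned = list(transform(deduplicate(quality_check(raw_records))))
print(cleaned)
```

Each stage processes one record at a time, so nothing waits for a batch to complete before moving downstream.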

In addition, because Striim is an enterprise-grade platform, it can scale with Cosmos DB and reliably guarantee delivery of source data while also providing built-in dashboards and verification of data pipelines for operational monitoring purposes.

The Striim wizard-based UI enables users to rapidly create a new data flow to move data to Cosmos DB. In this example, real-time change data from Oracle is being continually delivered to Cosmos DB through the SQL API. The wizard walks you through all the configuration steps, checking that everything is set up properly, and results in a data flow application. This data flow can be enhanced to filter, transform, and enrich the data through SQL-based queries. Here we are adding a name and email address from a cache, based on an ID present in the original data.
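The enrichment step in this example amounts to a lookup join between the change stream and a cache. The Python sketch below shows the idea under stated assumptions: the `customer_id` and `order_id` fields, the cache contents, and the `enrich` function are hypothetical, and in Striim the equivalent logic would be written as a SQL-based query rather than Python.

```python
from typing import Dict, Iterable, Iterator


def enrich(events: Iterable[dict], cache: Dict[int, dict]) -> Iterator[dict]:
    """Attach name and email from the cache, matched on the event's customer_id.

    Events with no cache entry are passed through with None for both fields.
    """
    for event in events:
        match = cache.get(event["customer_id"], {})
        yield {**event, "name": match.get("name"), "email": match.get("email")}


# Hypothetical reference data, loaded once and held in memory.
customer_cache = {
    42: {"name": "Ada Lovelace", "email": "ada@example.com"},
}

order_changes = [
    {"order_id": 1, "customer_id": 42},
    {"order_id": 2, "customer_id": 99},  # no matching cache entry
]

enriched = list(enrich(order_changes, customer_cache))
print(enriched)
```

Keeping the reference data in memory means each event is enriched without a round trip to another database.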

When the application is started, data will begin flowing in real time from Oracle to Cosmos DB. Making changes in Oracle results in the transformed data being written continually to Cosmos DB, as you can see through the Cosmos DB data explorer UI.

Of course, we are not limited to writing through the SQL API. In this example, we are writing Oracle data to a Cassandra model, which can be utilized directly by existing or new Cassandra applications. Here’s what the data looks like in this case.

Striim and Cosmos DB can change the way you do analytics, with Cosmos DB providing global rapid access to the real-time data provided by Striim. The globally distributed cloud database service needs data delivered to the cloud, and Striim can continually feed Cosmos DB with the data you need to run your business.

Try Striim and Cosmos DB today through the Striim for Real-Time Data Integration to Cosmos DB offering on the Azure Marketplace, to see your data how, where, and when you want it.