Continuous Movement of Data In and Out of Azure HDInsight
Microsoft® Azure® HDInsight® is a fully managed cloud service on Azure for open-source analytics. It enables customers to use popular open-source frameworks such as Hadoop, Spark, Hive, LLAP, Kafka, Storm, R, and more in the Azure cloud environment. Azure HDInsight supports a broad range of use cases including data warehousing, machine learning, and IoT analytics.
With Azure HDInsight, companies can benefit from the comprehensive capabilities of a high-powered and reliable cloud service environment to perform big data analytics with higher developer productivity and lower cost of management. To fully take advantage of this, companies need to be able to move data in and out of HDInsight in real time.
Striim is a cloud-hosted platform that enables the continuous movement of data, both into HDInsight from a wide variety of data sources and out of HDInsight to a wide variety of data targets.
When Azure customers sign up to use HDInsight to run analytics workloads in the cloud via Hadoop, Kafka, or Spark, they need to set up data flow from their on-premises and other cloud-based data sources to their analytical environment on HDInsight.
Striim provides continuous, real-time data integration to HDInsight from enterprise databases (using low-impact change data capture, or CDC), log files, messaging systems, sensors, and Hadoop solutions.
It offers a secure, reliable, and scalable service for real-time collection, preparation, and movement of unstructured, semi-structured, and structured data into Kafka, Hadoop, and Spark on Azure HDInsight.
While the data is streaming, Striim enables in-flight processing and enrichment before delivering it to Kafka, HDFS, HBase, Hive, or Spark. HDInsight customers can store the data in the right format from the start, which helps them get insight from their analytics applications sooner.
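To make the idea of in-flight processing concrete, here is a minimal, generic sketch in Python of the pattern described above: change events are filtered and enriched while they stream, then handed to a target. It does not use or represent Striim's own API. The `ChangeEvent` shape, the `enrich` and `deliver` functions, and the customer lookup table are all hypothetical names chosen for illustration, and the in-memory list stands in for a real target such as Kafka or HDFS.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List


@dataclass(frozen=True)
class ChangeEvent:
    """One row-level change, in a hypothetical shape a CDC reader might emit."""
    table: str
    op: str                   # "INSERT", "UPDATE", or "DELETE"
    row: Dict[str, object]


def enrich(
    events: Iterable[ChangeEvent],
    customers: Dict[object, Dict[str, str]],
) -> Iterator[Dict[str, object]]:
    """Filter and enrich events one at a time, without buffering the stream."""
    for event in events:
        if event.op == "DELETE":
            continue  # assumption: this illustrative target only wants upserts
        customer = customers.get(event.row.get("customer_id"))
        record = dict(event.row)
        # Join against reference data; unknown customers get region None.
        record["region"] = customer["region"] if customer else None
        record["op"] = event.op
        yield record


def deliver(records: Iterable[Dict[str, object]], sink: List[Dict[str, object]]) -> int:
    """Append each record to the sink (a stand-in for a real target) and count them."""
    delivered = 0
    for record in records:
        sink.append(record)
        delivered += 1
    return delivered
```

Because `enrich` is a generator, each event is transformed as it arrives rather than after a batch has accumulated, which is the property that in-flight processing relies on. In a real deployment, the lookup table and the sink would be replaced by whatever reference data and delivery mechanism the integration platform provides.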