Striim vs. StreamSets Feature Comparison

Fully automated, fully managed service with high availability

Striim is a fully automated, fully managed service available on Amazon Web Services, Google Cloud, and Microsoft Azure. Striim can also be deployed with high availability and strict SLAs for uptime and disaster recovery.

StreamSets is self-hosted and requires customers to manage the data processing plane.

Striim

StreamSets

Hybrid cloud and on-premises deployments

Both Striim and StreamSets can be deployed in on-premises and hybrid (on-premises-to-cloud) topologies.

Striim

StreamSets

Exactly Once Data Delivery Guarantee

Striim’s advanced checkpointing capabilities ensure that no events are missed or processed twice.

StreamSets offers a choice of “At Least Once” or “At Most Once” data delivery guarantees, which means that data can be either duplicated (At Least Once) or lost (At Most Once), depending on which option is chosen.
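As a rough illustration of how checkpointing can avoid both loss and duplication, the sketch below replays a source after a simulated failure and skips everything at or before the last committed position. The names `Checkpoint`, `Target`, and `replay` are hypothetical and used only for illustration; this is not Striim's API, and it assumes the target can commit a row and the checkpoint position atomically.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Checkpoint:
    """Record of the last source position whose effects were committed."""
    position: int = -1


@dataclass
class Target:
    """Stand-in for a target that commits a row and the checkpoint together."""
    rows: list = field(default_factory=list)
    checkpoint: Checkpoint = field(default_factory=Checkpoint)

    def commit(self, position: int, row: str) -> None:
        # Assumption: in a real system both writes share one transaction.
        self.rows.append(row)
        self.checkpoint.position = position


def replay(source: list, target: Target, crash_after: Optional[int] = None) -> None:
    """Read the source from the start, skipping positions already committed."""
    for position, row in enumerate(source):
        if position <= target.checkpoint.position:
            continue  # delivered before the restart, so do not deliver again
        target.commit(position, row)
        if crash_after is not None and position == crash_after:
            return  # simulate a failure mid-stream


events = ["e0", "e1", "e2", "e3"]
target = Target()
replay(events, target, crash_after=1)  # first run stops after e1
replay(events, target)                 # restart re-reads from the beginning
print(target.rows)
```

After the restart, each event appears in the target once. Without the checkpoint comparison, the second run would deliver `e0` and `e1` a second time, which is the at-least-once behavior.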

Striim

StreamSets

High-Performance Change Data Capture (CDC)

Striim supports high-performance, log-based CDC for many popular databases, including Oracle, PostgreSQL, MongoDB, MySQL, HPE NonStop, and SQL Server. Built by the executive and technical team from GoldenGate Software, Striim brings decades of experience in mission-critical enterprise workloads.

StreamSets offers CDC from popular databases, but offers only “At Least Once” or “At Most Once” delivery.

Striim

StreamSets

Data Observability to meet business SLAs

Striim offers detailed and customizable real-time dashboards and alerts visualizing end-to-end data delivery from source to target. Striim matches source and target transactions and alerts users to missing transactions, making it easy to identify issues as they occur. Striim offers data delivery and latency SLAs. Customers see end-to-end latency under 2 seconds.

StreamSets allows users to monitor jobs with a “Realtime Summary” that includes Record Count, Record Throughput, and Runtime statistics. It provides error messages and logs, which can be time-consuming to scan through when data is lost or lagging.
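To make the idea of matching source and target transactions concrete, here is a minimal sketch that flags transactions missing from the target and transactions that arrived later than a latency threshold. The function name `reconcile`, the dictionary layout, and the 2-second default are assumptions chosen for illustration; they do not describe how Striim implements this internally.

```python
def reconcile(source, target, max_lag_s=2.0):
    """Compare commit times (in seconds) keyed by transaction ID.

    Returns (missing, late): IDs absent from the target, and IDs whose
    source-to-target lag exceeds max_lag_s.
    """
    missing = [txn for txn in source if txn not in target]
    late = [
        txn for txn in source
        if txn in target and target[txn] - source[txn] > max_lag_s
    ]
    return missing, late


# Hypothetical commit times observed at each end of the pipeline.
source_commits = {"t1": 0.0, "t2": 1.0, "t3": 2.0}
target_commits = {"t1": 0.4, "t3": 5.5}

missing, late = reconcile(source_commits, target_commits)
print("missing:", missing)
print("late:", late)
```

Here `t2` never reached the target and `t3` arrived 3.5 seconds after its source commit, so both would be candidates for an alert.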

Striim

StreamSets

Custom Alerts

Striim allows custom alerts on data delivery SLAs, data loss, and user-defined rules. Striim’s custom alerts are created using Streaming SQL. 

StreamSets also allows the creation of custom alerts using the StreamSets expression language (based on JSP 2.0 expression language).

Striim

StreamSets

Automated Corrective Actions

Striim users can create custom workflows to perform corrective actions in the event of errors or failures. By tapping into error or status streams, users can trigger compensating data flows or perform other actions to remediate problems.
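The pattern of routing failures to an error stream and reacting to them can be sketched as follows. This is a generic illustration, not Striim's workflow API: `run_with_compensation`, `deliver`, and `compensate` are hypothetical names, and a `deque` stands in for the error stream.

```python
from collections import deque


def run_with_compensation(records, deliver, compensate):
    """Deliver records; route failures to an error stream, then remediate them."""
    errors = deque()  # stand-in for an error stream
    for record in records:
        try:
            deliver(record)
        except ValueError as exc:
            errors.append((record, str(exc)))

    handled = []
    while errors:
        record, reason = errors.popleft()
        handled.append(compensate(record, reason))
    return handled


delivered, quarantined = [], []


def deliver(record):
    # Hypothetical validation rule: reject negative amounts.
    if record["amount"] < 0:
        raise ValueError("negative amount")
    delivered.append(record)


def compensate(record, reason):
    # Compensating flow: park the record with the failure reason for review.
    quarantined.append({**record, "reason": reason})
    return record["id"]


handled = run_with_compensation(
    [{"id": 1, "amount": 10}, {"id": 2, "amount": -5}], deliver, compensate
)
print(handled, quarantined)
```

The main flow keeps running when a record fails, and the compensating step decides what happens to the failed record.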

Striim

StreamSets

Real-Time Data Enrichment

Striim supports data enrichment and normalization using in-memory key-value stores for historic data. This allows users to enrich raw, real-time data with historical aggregates and lookup data.
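A minimal sketch of this kind of enrichment, assuming the historical data fits in an in-memory dictionary keyed by customer ID. The cache contents, field names, and the `enrich` function are hypothetical and shown only to illustrate the lookup pattern.

```python
# Hypothetical in-memory key-value store holding historical aggregates.
customer_cache = {
    "c1": {"tier": "gold", "avg_order": 120.0},
    "c2": {"tier": "silver", "avg_order": 45.5},
}


def enrich(event, cache):
    """Attach lookup fields to a raw event; unknown keys get default values."""
    lookup = cache.get(event["customer_id"], {})
    return {
        **event,
        "tier": lookup.get("tier", "unknown"),
        "avg_order": lookup.get("avg_order"),
    }


raw_events = [
    {"customer_id": "c1", "amount": 300.0},
    {"customer_id": "c9", "amount": 12.0},
]
enriched = [enrich(event, customer_cache) for event in raw_events]
print(enriched)
```

Because the lookup is a local dictionary read, each event is enriched without a round trip to an external database.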

Striim

StreamSets

Real-time Transformations

Striim users apply streaming SQL for in-flight transformations, correlation, aggregation, masking, filtering, and analytics. Striim scales horizontally with in-memory compute for high-performance transformations.

StreamSets Transformer leverages Apache Spark to allow users to perform stream processing and machine learning operations. However, this approach carries the overhead of batch processing.
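The transformations named above (filtering, masking, and aggregation) can be illustrated in plain Python. This sketch is not streaming SQL and does not use either product; the functions `mask_card` and `tumbling_totals`, the event fields, and the 10-second window are assumptions for illustration.

```python
from collections import defaultdict


def mask_card(number):
    """Replace all but the last four characters with asterisks."""
    return "*" * (len(number) - 4) + number[-4:]


def tumbling_totals(events, window_s):
    """Sum amounts per fixed, non-overlapping window keyed by window start."""
    totals = defaultdict(float)
    for event in events:
        window_start = (event["ts"] // window_s) * window_s
        totals[window_start] += event["amount"]
    return dict(totals)


events = [
    {"ts": 1, "card": "4111111111111111", "amount": 20.0},
    {"ts": 4, "card": "5500000000000004", "amount": 0.0},
    {"ts": 7, "card": "4111111111111111", "amount": 5.0},
    {"ts": 12, "card": "340000000000009", "amount": 30.0},
]

# Filter out zero-amount events and mask the card number in flight.
cleaned = [{**e, "card": mask_card(e["card"])} for e in events if e["amount"] > 0]
totals = tumbling_totals(cleaned, window_s=10)
print(cleaned[0]["card"], totals)
```

Each event is handled as it is read, so the masked values and running window totals are available without waiting for a batch to complete.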

Striim

StreamSets

Cloud Partnerships

Striim’s cloud partners include Google, Microsoft, AWS, and Snowflake. Striim partners closely with cloud vendors to support a full breadth of endpoints for a variety of strategic use cases. Striim also supports deployment via metered and SaaS marketplace offerings to take advantage of cloud scalability.

StreamSets was acquired by Software AG and subsequently by IBM.

Striim

StreamSets

Data Sources: Cloud + On-Premise Databases and Data Warehouses

Striim

StreamSets

Data Sources: IoT Devices

Striim

StreamSets

Data Sources: Kafka

Striim

StreamSets

Data Targets: Cloud Data Warehouses and Databases

Striim

StreamSets

Data Targets: Files and Logs, Messaging Systems, Big Data

Striim

StreamSets

Striim offers a modern data platform that's both powerful and easy to use

Select from hundreds of templates to simplify building your data flows. A step-by-step wizard will lead you through the process of connecting to your source and target to create a data flow application. You can also create custom data flows from scratch.

[Screenshot: 4.0 wizards]

Your data flow defines how to collect, process, and deliver data. The simplest data flow just has a source, a stream, and a target. In many cases you will need to perform some processing on your data. Striim enables you to set up continuous SQL queries optimized for streaming, real-time data.
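The source, stream, and target shape described above can be sketched with Python generators, with a filter-and-project step standing in for a continuous query. The function names and the order fields are hypothetical; this illustrates the shape of a data flow and is not Striim's own syntax.

```python
def source(rows):
    """Emit events one at a time, as a source reader would."""
    for row in rows:
        yield row


def continuous_query(stream, predicate, projection):
    """Evaluate each event as it arrives: keep matches and reshape them."""
    for event in stream:
        if predicate(event):
            yield projection(event)


def target(stream, sink):
    """Write every event from the stream to the sink."""
    for event in stream:
        sink.append(event)
    return sink


orders = [
    {"order_id": 1, "status": "shipped", "total": 40.0},
    {"order_id": 2, "status": "pending", "total": 15.0},
    {"order_id": 3, "status": "shipped", "total": 99.0},
]

shipped = continuous_query(
    source(orders),
    predicate=lambda e: e["status"] == "shipped",
    projection=lambda e: {"order_id": e["order_id"], "total": e["total"]},
)
delivered_orders = target(shipped, sink=[])
print(delivered_orders)
```

Generators are used here because they process one event at a time, which mirrors how a query over a stream runs continuously rather than over a finished data set.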

Our built-in dashboards and monitoring enable you to see the state of your data flows in real-time and easily identify any bottlenecks. Striim can also validate that your data has been delivered and provide visibility into the end-to-end lag. This level of visibility is essential for mission-critical systems that may have SLAs regarding how current the data is.

You can also drill down on any of the components in a data flow to see detailed statistics that include read/write rate, lag, latency, CPU usage, and many other metrics. This detailed information can help identify any bottlenecks, and aids in tuning data flows for maximum performance and minimal latency.

Striim shows missing and long-running transactions
Striim's alerting feature

Striim allows you to define SQL-based custom alerts so you can stay informed about the status and performance of your data flows.

In the case of errors or failures, you can also automate workflows to perform corrective actions. By tapping into error or status streams, you can trigger compensating data flows or perform other actions to remediate problems.

Striim gives us a single source of truth across domains and speeds our time to market delivering a cohesive experience across different systems.

Neel Chinta, IT Manager at Macy's
