Making In-Memory Computing Enterprise Grade – Part 2: Architecture


This architecture post is the second blog in a six-part series on making In-Memory Computing Enterprise Grade. Read the entire series:

  1. Part 1: overview
  2. Part 2: data architecture
  3. Part 3: scalability
  4. Part 4: reliability
  5. Part 5: security
  6. Part 6: integration

Keep in mind that Striim is an end-to-end distributed In-Memory Computing (IMC) platform that provides Streaming Integration and Analytics through continuous ingest, processing, enrichment, analysis, delivery, alerting, and visualization of streaming data. As such, it has many moving parts that need to work together seamlessly to meet these enterprise-grade requirements.

Streaming Integration + Streaming Intelligence

The Striim platform can be employed for a wide variety of real-time data integration and streaming analytics use cases. These range from something as simple as moving database data that supports enterprise applications into a Data Lake or Kafka in real time, to much more complex analytics applications such as preventing fraud, monitoring infrastructure, or analyzing customer behavior to improve the customer experience. In all of these cases, ensuring scalable, reliable, and secure processing that integrates with existing resources is paramount.

As we drill down into each of these requirements, it is useful to understand the Distributed In-Memory Architecture that Striim employs to provide this end-to-end streaming integration and analytics platform.

Striim's Distributed In-Memory Architecture

There are seven main components to this architecture, and each needs to be considered when making the platform Enterprise Grade.

Each of these components has unique characteristics, and combining them all to form Striim makes it an example of a converged platform. Making any one of these components Enterprise Grade is a tricky enough proposition; ensuring that each of them is, while arranging things so that the interactions between components do not break anything, is much harder.

  1. Data Ingest: integrating data from existing enterprise sources in real time
  2. High Speed Messaging: moving huge volumes of data between Striim nodes with very low latency
  3. Persistent Messaging: storage of messages for replay in the case of failure
  4. IMDG for Metadata / Control: clustering Striim nodes and distributing metadata and user actions
  5. IMDG for Context: in-memory storage of external data for context and denormalization purposes
  6. Processing: continuous queries for processing, enrichment, and analytics
  7. Results: storage of results within the platform or in external resources
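To make two of these components more concrete, the sketch below shows (in plain Python, not Striim code) how an in-memory context store can be used to enrich streaming events by key, and how a continuous query over a sliding window might aggregate the enriched stream. All names here (`context`, `enrich`, `SlidingCount`) are illustrative assumptions, not part of the Striim API.

```python
from collections import deque

# Stand-in for the "IMDG for Context" component: external reference data
# cached in memory, keyed for fast lookup during enrichment.
context = {
    "user-1": {"tier": "gold"},
    "user-2": {"tier": "bronze"},
}

def enrich(event):
    """Denormalize an event by joining it with cached context data."""
    return {**event, **context.get(event["user"], {})}

class SlidingCount:
    """Toy continuous query: count events per tier over the last N events."""
    def __init__(self, size):
        self.window = deque(maxlen=size)  # old events fall off automatically

    def push(self, event):
        self.window.append(event)
        counts = {}
        for e in self.window:
            tier = e.get("tier", "unknown")
            counts[tier] = counts.get(tier, 0) + 1
        return counts

# Drive a small stream through enrichment and the windowed query.
cq = SlidingCount(size=3)
events = [
    {"user": "user-1", "amount": 10},
    {"user": "user-2", "amount": 5},
    {"user": "user-1", "amount": 7},
]
result = None
for ev in events:
    result = cq.push(enrich(ev))
print(result)  # per-tier counts over the last 3 enriched events
```

In the real platform, enrichment and queries run continuously and are distributed across nodes; this single-process sketch only illustrates the data flow between the context store and the processing step.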

In subsequent blogs we will drill down into each of the Enterprise Grade requirements and outline how we designed the platform so that each of the above components is scalable, reliable, and secure, and integrates well with enterprise infrastructure as required.

The architecture part starts at the 5m12s mark: