Real-Time Data Stories Powering Gen AI & Large Language Models (LLM)

5 Minute Read

Striim may be pronounced "stream," but the way Striim streams data is more than just classic streaming. 'Striiming' data ensures that the anytime, ever-changing story told by data is perpetually told. This story is "Real-Time".


In 1687 Newton, in his third law of motion, observed that every action has an equal and opposite reaction. Apple falling on head = Ouch! Genius. Yet over 300 years later, time delays are still endured between business-related actions and a business's reaction.

Jump forward to Henry Ford in 1913: his moving production line cut the time to build a car from over 12 hours to around 1.5 hours. "Batch" production was valuable then, but today it is the dated principle of batch processing of data that holds many businesses back from essential, real-time, business-critical insights. And this comes at the expense of business results, not to mention the rising cost of trying to make batch processing run faster. Nobody likes finding out what is going on once it is too late to do anything about it. Ouch.


Life happens in real time. Business customers and consumers expect organizations to respond, react and manage business in real time, whether across omnichannel consumer environments, supply chain decision-making, life-critical medical decisions, or responses to changes in the weather, exchange rates, stock prices, whims, needs and wants, power outages and more. All of these are examples of where the real-time processing of data can save and enrich lives, and be the powerhouse for the AI automation and LLMs that are transforming business operations.


By LLM we of course mean Large Language Models: you know, where you get a human-like response to a human-like question from a machine. A world where the machines learn (ML) and get smarter at making predictions that help us. And this, along with ChatGPT and NLP (Natural Language Processing), all comes under the generic banner of Generative AI: the modern-day evolution of good old Artificial Intelligence (AI).


Many of us have experienced the real-time effect, like when targeted offers find us on our devices just seconds after we utter a product's name. However, not many people are aware of the underlying advances in data streaming that power these instant, accurately targeted outcomes. Well, it is something to do with the continuous, real-time, simultaneous homogenization of petabytes of data from numerous sources, which are then modeled to drive automated reactions at the right (real) time, thanks to clever algorithms. Or it's pretty much that. Actions happen, and an appropriate real-time reaction can be generated. Real-time intelligent streaming of data is allowing Generative AI to enact Newton's third law: action, reaction.


The old IT world is still rife with "batch" processing of data operations: multiple individual data-source transfers, usually to one cloud or 'data lake', where different treatments are then applied, one after the other, to clean, dedupe, curate, govern, merge, wrangle and scrub this data as best as possible to make it passable for AI. Often this happens in a mysterious, inexplicable way, where dubious data can create "hallucinated" results or impose a bias. Too many people suffer from the delusion that 'if a result appears on a pretty dashboard, it must be true', rather than facing the reality of 'garbage in, garbage out'.


Streaming is not new; Striiming data is. Streaming is the continuous flow of data from a source to a target. It is fast and can ingest from databases via Change Data Capture (CDC). The difference with "Intelligent Streaming", or "Striiming", data is that multiple sources of data are extracted via CDC, intelligently integrated (ii) and Striimed simultaneously. And… in the same split second, the data is cleaned, cross-connected and correlated, with data science models applied to the data in transit. It arrives in the flexible cloud environment action-ready. AI-ready. That helps explain how, when things happen in the real world, there can be a real-time response. Action, reaction. Genius!
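The idea of transforming data while it moves, rather than after it lands, can be sketched in a few lines of plain Python. This is a minimal, hypothetical illustration of in-transit cleaning and correlation, not Striim's actual API; all names and records here are invented for the example.

```python
# Hypothetical sketch of "in-transit" stream processing: each change event
# is cleaned and enriched the moment it flows through the pipeline, rather
# than being batch-scrubbed after landing in a data lake.

def cdc_events():
    # Simulated change-data-capture feed from a source database.
    yield {"op": "INSERT", "customer": "  Alice ", "amount": "120.50"}
    yield {"op": "UPDATE", "customer": "BOB", "amount": "80.00"}

def clean(events):
    # Normalize fields while the data is still moving.
    for e in events:
        yield {**e, "customer": e["customer"].strip().title(),
               "amount": float(e["amount"])}

def enrich(events, segments):
    # Correlate each event with reference data in the same pass.
    for e in events:
        yield {**e, "segment": segments.get(e["customer"], "unknown")}

segments = {"Alice": "gold", "Bob": "silver"}
ready = list(enrich(clean(cdc_events()), segments))
# Each record arrives at the target already cleaned and correlated.
```

Because each stage is a generator, every record is transformed as it passes through, in the same "split second" it is captured, rather than waiting for a scheduled batch job.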

It is the rocket fuel for agile cloud environments such as Snowflake and Snowpark, Google BigQuery and Vertex AI, Databricks, Microsoft Azure ML, Amazon SageMaker, and data science platforms like Dataiku.


Large Language Models are like a professional business version of ChatGPT. An LLM is not just a trawl and regurgitation of the internet. The difference is that an enterprise LLM is anchored to vast sets of real organizational data that is defined, structured and can be openly challenged for provenance, legacy and validity. Striiming data can differentiate in these LLM contexts thanks to its ability to access huge volumes of historic and newly generated real-time data. This reveals new, true stories that human brain power could never fathom and that batch processing struggles to cater for, allowing predictions and actions to be output within seconds of events occurring.


The word 'Online' here does not mean 'being online'; it refers to a fresh-feedback fashion of ML. Continuously Striiming data is a real-time enabler of Online Machine Learning, a superior form of model training that perpetually feeds fresh, continuously Striimed data to the training models, as opposed to conventional ML, which trains on an initial, static data set.
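The contrast between static and online training can be made concrete with a toy model. The sketch below, which is purely illustrative and not a Striim or production API, fits a single-weight model sample by sample as new data "streams" in, so the model keeps improving without ever refitting on the full historical data set.

```python
# Minimal sketch of online learning: a one-weight model y ≈ w * x,
# updated with a stochastic gradient step on each arriving sample.

def online_fit(stream, lr=0.05):
    w = 0.0
    for x, y in stream:            # each new event refines the model
        error = w * x - y
        w -= lr * error * x        # per-sample gradient update
    return w

# True relationship: y = 2x. "Historic" data followed by fresh arrivals;
# the values and learning rate are invented for illustration.
stream = [(x, 2.0 * x) for x in [1, 2, 3, 1, 2, 3] * 20]
w = online_fit(stream)
# w converges toward 2.0 without a full batch retrain.
```

A conventional batch model would be trained once on a frozen snapshot and drift out of date; the online version folds in each fresh observation, which is the "fresh-feedback" property the paragraph above describes.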

Online ML delivers significantly higher prediction accuracy from the machines, and it begins to explain the breathtaking speed and accuracy with which answers to questions appear to us, in perfectly articulated words and figures, as the outputs of LLMs.


Before Striiming, it was thought too complex to interrogate vast oceans of deeply tangled and submerged data. Not so now. The in-memory compute power of Striim and its Striiming approach can add the relevance of this historic data, applying its story and meaning concatenated with other relevancy-selected, split-second, continuously changing current-day data from live events and actions. Hence Online ML, served by Striiming data, can yield better forecasts, predictions and reactions.


Well, saving lives for one. But let's take a look at some other real-life scenarios. Picture hundreds of cameras at an airport capturing gigabytes of intel on file. How many people, where, how many suitcases, size, drivers, pilots, engines, parts, fuel, brakes, fails, fixes, stock, threats, tourists, delays: this airport ballet plays out every day. Seemingly unrelated scenarios, actions, reactions and stories are captured and recorded within yottabytes of file data. The cameras capture patterns and meaning far beyond human comprehension, yet the story, in the context of other cross-referenced real-time data, is of huge importance to those who can extract meaning from it in real time. So what? Well, it means getting staff, passengers, luggage and parts to the right place at the right time. For at least one airline, it ensures hundreds more planes take off safely and on time, saving an estimated $1 million each time a plane is not delayed waiting for a part or a person.


Humans are getting smarter, and data science expertise grows at an impressive rate, but arguably what is fueling the greatest impact on LLMs and Gen AI is the speed and quality of data prepared ready-made for the new clever models, algorithms and ML recipes. Sure, AI teaches itself from legacy and new data oceans. But remember, humans are the creators of these new data Striiming methods and the models that yield the results. Humans have learnt to think like computers (actions). So no wonder the computers seem to be thinking more like humans (reactions).


In my humble and biased opinion, this is finally the best evidence, application and observation of Newton's third law of motion within Generative AI: actions and instant AI reactions for large enterprises. Solving old problems in a new, real-time way: saving lives and money, making money and mitigating risk. The same problems we have always had. Only now, Striiming solutions, by virtue of CDC and "ii" (intelligent integration), are certainly a next-generation, powerful way to solve them.

Remember. Don’t stream data when you can Striim data.  

Aye aye… (ii). Roger, over and out. Email me.