Build vs. Buy for Streaming Data Integration: Whiteboard Wednesdays

6 Minute Read

In this Whiteboard Wednesday video, Steve Wilkes, founder and CTO of Striim, asks the question, “Is it better to build or buy a Streaming Data Integration platform?”  Read on, or watch the 10-minute video:

You want to use streaming data integration to move data from existing sources, that generate lots of data, to targets. But is it better to build a streaming data integration platform from lots of readily available open source components, or buy one and have that work done for me?


First, you need all the components that we’ve mentioned in previous videos that go into making a streaming data integration platform. You need to build data collectors and manage data delivery – which is not just one single thing. You need many adapters that support databases, files, maybe IOT cloud targets, essentially a whole bunch of different technologies to get the data from where it is now to where you want it to be.

You need to think about data movement? What messaging system do I use. Do I use multiple ways of doing data movement? How am I going to do stream processing? What is the engine for that? How do I define it? Is there a UI in open source? Not everything has a UI. How do I define the stream processing? Does it support SQL, which in a previous video we mentioned is the best approach working with streaming data. How do I enrich data? Does the platform include an in-memory data grid or other mechanism by which I can enrich data at high speed? How do I ensure that all of this is enterprise grade? Is it scalable, reliable, and secure? How do I get insights and alerts? How do I view the data flows and understand what’s going on if I’m piecing this together myself? This is essential for a lot of operational reasons.

Build vs. Buy for Streaming Data Integration

So, assume that you are going to try and build this from lots of different open source pieces. For every piece that goes into making up this platform, including all the framework pieces and the adapters that you will need on either end, you need to go through a process. This process involves designing the overall platform – how do all the pieces fit together. Then for each piece you need to look at what is available, and evaluate each option, not just on their own, but how they fit in with the other pieces that you are looking at. Once you have done that for each piece, you need to try and integrate all the pieces together. That integration involves building a lot of the glue code on top of the pieces that you have chosen to abstract it so that it is easier for people to work with and ensure that it is enterprise grade; it scales together, it is reliable, and it is secure. Assuming that you have built all of this, you can then start to build your applications. 


Now when it comes to maintenance, open source isn’t always maintained forever. People can stop supporting it. In that case you have to identify another piece of open source that can take its place and will fit in. Then you have to integrate again. Even if it’s just upgraded, that could mean the APIs have changed or the way that it functions has changed. You’re going to have to perform the integration and test everything again to make sure that everything is working. Once that is complete you are going to have to test your applications on the platform that you’ve just built again and make sure that’s working too.


If you have issues, you need to go to support. Some open source platforms offer support, some don’t. There may be pieces that you’re supporting yourself or because you’ve chosen to put multiple things together. In the case of Change Data Capture (CDC) there may be multiple vendors providing support. That becomes a headache because different organizations tend to like to blame each other. You need to have support for all the pieces. If you need to upgrade, then you need to reintegrate again. Maintenance can become a big issue.


Now the difference between doing all of that and using a prebuilt STI solution is that you’re not doing any of these pieces in the middle. In going the buy route, I am going to replace all those pieces of open source that I have put together myself with the STI solution straight out of the box. You simply download the solution and start building your data flows. That is going to massively reduce the amount of time for you to start building your data integrations.

Build vs. Buy - Streaming Data Integration

The Differences Between Build vs. Buy

When it comes to summarizing all of this, what are the differences between the costs and risks of build vs. buy?

Development Costs

We went through all the steps of building a solution. Each one of those steps are going to involve engineers. They’re going to involve people that understand different technologies and there is going to have to be a team around that to manage building the platform. It can be a pretty high development cost and often that development cost can outweigh the cost of licensing a platform that you purchase. 

Time to Value

There’s a long time to market because you have to go through all those steps for each component. If components are upgraded or deprecated, you have to repeat all of those steps. So the time it takes from you saying, “I want to move data from my on-premise databases into my cloud data warehouse” or “I want to move data from IOT on premise into storage so I can do machine learning”. The time it takes for you to go from thinking “I want to do that to being able to do that” is as long as it takes you to build your framework and infrastructure, build your own SDI platform.

Whereas if you’re buying it, the time it takes is how long does it take to download it, which is much faster. The time to a solution is massively reduced because you’re not going through all the steps it takes to build it and during that time not focusing on building business value.

A lot of organizations out there are not software development companies, they are companies with a different purpose. They are a bank, or a finance organization, or a healthcare organization. Value to the organization is massively increased if you’re not spending huge amounts of time building something that someone else has already built. Focusing on business value enables you to build out solutions quickly and bring value to your business much more quickly – utilizing a prebuilt solution rather than building it yourself.

Build vs. Buy - Streaming Data Integration


I’ve mentioned that open source can become obsolescent. There are cases where people invested in open source technologies which people just stopped developing because it was no longer the hot new thing. They now had to try and maintain it themselves and understand someone else’s code in order to maintain the open source that they’d heavily invested in.

If you are buying a streaming data integration solution from a vendor, then that vendor is going to support you. It requires a wide range of specialized skills. Each one of the pieces of open source requires a different skill set in order to understand it and work with it. It may be a different programming language; it may be a different environment that you need to set up. There’s also a meta skill required in understanding how all those components fit together. This is an architectural role that has to understand all the different APIs and different components and work at how to add all the enterprise grade features so that they all scale, are reliable, and all work together. If you’re buying the costs really are the license cost and limited visibility into the code, into the engine that makes it work.

Depending on who you are, that may or may not be a big issue. As long as what you’ve built and the concepts around what you’ve built are transportable then investing in streaming data integration is also transportable.

What are the differences between build vs. buy? It really comes down to, do you want to get going with building your solution and providing business value straight away? Or do you want to focus on being a software development organization building a platform that can then build the applications that will provide business value.

To learn more about build vs. buy for streaming data integration, please visit our Real-time Data Integration solution page, schedule a demo with a Striim expert, or download the Striim platform to get started.

Streaming Data Integration

The Power of Data Mesh, Data Fabric, and Striim

Organizations struggle with managing vast amounts of complex data. Traditional methods often fall short, leading to data silos and security vulnerabilities. Innovations like Data Mesh