The IoT is a natural ecosystem for streaming analytics

With streaming analytics and stateful services, machines can respond to data at machine speed.

By Timothy McGovern
March 16, 2016
Mill Wier. (source: Ian Richardson on Flickr)

In this O’Reilly Podcast, O’Reilly’s Ben Lorica talks with Ryan Betts, CTO of VoltDB, about the IoT and streaming data. Their discussion covers unexpected use cases for IoT; building networked things that respond to IoT data (not just feeding data in); and the big picture of data management in the context of many, many networked things.

Manufacturing use cases for IoT

Lorica and Betts discussed a few pioneering use cases for IoT. Betts notes that manufacturing is one area where we’re just beginning to see the payoff from gathering more—and novel—types of data:

I was talking to one manufacturer that makes a lot of memory. When they look at their manufacturing process and they think about IoT manufacturing for them, there is actually a lot of signal processing that goes on in the quality assurance process. They want to compare photos off of the assembly line; they have a lot of visual, and even in some cases, audio information that they want to process as an input to their manufacturing process. That is something that is pretty domain specific, but still, they are generating the volume and the velocity of events, and they’ll augment that core platform with the ability to perform the domain-specific analysis that is important to them.

On responding at machine speed

Ad networks have pioneered fast responses, delivering ads in tens or hundreds of milliseconds. But that same speed can also make your devices safer and your systems more resilient. Betts explains:

At [sub-200-millisecond response times], you are starting to see that the human time frame and the machine time frame aren’t so different. It also goes back to what you are going to do with your analytics. If you’re doing analytics in an electrical network, for example, and you can detect that some component is about to explode, you don’t want to put that on the dashboard and wait for a person to react. You want to be able to write an operational application—and respond to that event—in an automated way.

They can say, ‘here’s my defined policy: when a device exceeds this threshold under these circumstances, I need to shut it down, or reroute it. I need to turn off this machine so it can be maintained before a more severe breakage occurs.’ At that point, you don’t want human interaction. The important thing there is to encode that operational action into something that does happen in machine speed. It’s a really interesting question, and I think it is not always clean cut whether you are operating at machine speed or human speed. There are cases where you need a human’s attention, but you might have much less time to get it than you realize, not even full seconds. There are other cases where you simply cannot wait for a human at all. You want to be able to react immediately.
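The kind of policy Betts describes ("when a device exceeds this threshold under these circumstances, I need to shut it down") can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not VoltDB's API or anything discussed in the podcast: the `Reading` fields, the 90-degree limit, the three-reading window, and the `"shut_down"` action name are all hypothetical. The point is that the rule keeps a small amount of per-device state and returns its decision as each event arrives, with no dashboard or person in the loop.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One sensor event. All fields are hypothetical, for illustration only."""
    device_id: str
    temperature_c: float
    under_load: bool


class ThresholdPolicy:
    """Shut a device down after `window` consecutive over-limit readings under load."""

    def __init__(self, limit_c=90.0, window=3):
        self.limit_c = limit_c
        self.window = window
        # Per-device state: whether each of the last `window` readings breached the limit.
        self._recent = defaultdict(lambda: deque(maxlen=window))

    def on_reading(self, reading):
        recent = self._recent[reading.device_id]
        breached = reading.under_load and reading.temperature_c > self.limit_c
        recent.append(breached)
        if len(recent) == self.window and all(recent):
            recent.clear()  # reset so the same breach is not acted on twice
            return "shut_down"
        return "ok"


def run(policy, stream):
    """Apply the policy to each event in arrival order and collect the decisions."""
    return [(r.device_id, policy.on_reading(r)) for r in stream]


stream = [
    Reading("pump-1", 95.0, True),
    Reading("pump-1", 96.0, True),
    Reading("pump-2", 99.0, False),  # hot but idle: the "circumstances" are not met
    Reading("pump-1", 97.0, True),   # third consecutive breach for pump-1
]
actions = run(ThresholdPolicy(), stream)
```

In a real deployment the returned action would drive an actuator or a rerouting step rather than being collected in a list, and the state would live in a durable, stateful service rather than in process memory.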

Reference architecture for the IoT

Is there a new, IoT-specific set of best practices for data management? Betts says the practices we have in place already will evolve to suit the IoT:

I would love to say yes, but I think the answer is that they really are just evolving the big data practices that they’ve learned. What we see is really a continuity, and we see a continuity of reference architecture for big data across IoT, across non-IoT-oriented verticals. Really, we’re just seeing those core data management requirements in all of these cases. I think some of the things that are really interesting in that space are trying to understand how do you handle stateful services in a SaaS deployment? What does that mean for a micro-services style architecture? How do you really work with data services in a way that manages continuous deployment? How do you figure out the responsibilities for data at the edge, versus data that should be centralized? There is a lot of architectural thought, and I think best practices remain to be evolved, and sort of codified through our experience, but I think when people look at the data management platform, what the core requirements are, they don’t differ too much from IoT to big financial apps to big data apps, in general.

This post and podcast are part of a collaboration between O’Reilly and VoltDB. See our statement of editorial independence.

Post topics: Data science