In this podcast episode, I talk about fast data processing and reactive systems with Karl Wehden, director of product management at Lightbend. If the conversation piques your interest in these topics, be sure to check out Tom Peck’s live webcast on March 28, 2017.
Wehden believes there are several factors driving the transformation of batch storage systems into streams within data processing. “The storage of data has traditionally been batch in many degrees, or has been very bespoke when it hasn't been,” he says. “I think what's driving the transformation to fast data and streams today is the need to keep up with a continuous explosion of data, but, also, the constant reset of expectations in terms of responsiveness and the reactiveness, if you will, of the systems that people are using. That move has really driven people away from batch and into what we think of as a reactive or fast data pattern.”
The expectation behind a move to fast data and streams-based technology is that data will arrive faster and be processed asynchronously from an event log or sequence of events. But Wehden thinks adopting streams within your enterprise can be a tricky endeavor without careful consideration of the technology, latency, and programming model. “Take all those three things into account,” he says. “What's your messaging layer look like, what latency are you driving, and how easy is it to work with? Is it appropriate for your development community? Is it appropriate for your operations team? Is it something that makes sense? Focus on ease of use is my suggestion.”
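To make the idea of asynchronously processing a sequence of events concrete, here is a minimal sketch in Python. It is not taken from the podcast or from any Lightbend product; the names `event_source`, `running_totals`, and `EVENTS` are hypothetical, and an in-memory `asyncio.Queue` stands in for a real messaging layer. The point is only that results are updated after every event rather than once per batch.

```python
import asyncio


async def event_source(events, queue):
    """Hypothetical producer: publishes events to the queue one at a time."""
    for event in events:
        await queue.put(event)   # waits here if the buffer is full
        await asyncio.sleep(0)   # yield control so the consumer can run
    await queue.put(None)        # sentinel marking the end of the stream


async def running_totals(queue):
    """Consumer: updates a per-key total as each event arrives."""
    totals = {}
    snapshots = []
    while True:
        event = await queue.get()
        if event is None:
            break
        key, amount = event
        totals[key] = totals.get(key, 0) + amount
        snapshots.append(dict(totals))  # an up-to-date result after every event
    return totals, snapshots


async def main(events):
    # Small bounded buffer (an assumption for illustration): the producer
    # must wait whenever the consumer falls behind.
    queue = asyncio.Queue(maxsize=2)
    producer = asyncio.create_task(event_source(events, queue))
    totals, snapshots = await running_totals(queue)
    await producer
    return totals, snapshots


EVENTS = [("clicks", 1), ("orders", 5), ("clicks", 2)]
totals, snapshots = asyncio.run(main(EVENTS))
print(totals)  # {'clicks': 3, 'orders': 5}
```

A batch job would compute the same final totals, but only after all the events had been collected. Here every entry in `snapshots` is a usable answer at the moment its event was processed, which is the responsiveness Wehden describes.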
Wehden describes the four main reasons why enterprises would want to invest in a fast data platform:
- Rapid startup times
- High system productivity
- Executing on industry best practices
- Capturing telemetry and metrics for all running components
So, how quickly are enterprises making the move to reactive systems featuring fast data streams? Wehden says organizations that didn't make the jump to a distributed data storage system such as Hadoop are even more likely to move from batch to streaming technologies, because they see an opportunity to “leapfrog” the lessons distributed batch systems taught us and start consuming and processing data in the stream. And for companies that want to pursue a path parallel to these leapfroggers, Wehden says technologies like the Hadoop stack are being used to store streaming data.
For more on fast data processing and reactive systems, attend Tom Peck’s live webcast on March 28, 2017.
This post is a collaboration between O'Reilly and Lightbend. See our statement of editorial independence.