Chapter 7. The World Is Going Streaming
You could not step twice into the same river. Everything flows and nothing stays.
The need for asynchronous message-passing covers not only responding to individual messages or requests, but also handling continuous, potentially unbounded streams of data. Over the past few years, the streaming landscape has exploded, both in the number of products and in the definitions of what streaming really means.1
There’s no clear boundary between processing messages that are handled individually and processing data records en masse. Messages have an individual identity and each one requires custom processing, whereas records can be thought of as anonymous to the infrastructure and processed as a group. However, at very large volumes it’s possible to process messages using streaming techniques, and at low volumes records can be processed individually. Hence, whether data arrives as records or as messages is orthogonal to how it is processed.
The fundamental shift is that we’ve moved from “data at rest” to “data in motion.” The data used to be offline and now it’s online. Applications today need to react to changes in data in close to real time, as they happen: to perform continuous queries or aggregations over inbound data and feed the results, again in real time, back into the application to affect the way it operates.
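To make the idea of a continuous aggregation concrete, here is a minimal sketch in Python. It is an illustration only, not taken from any particular streaming product: the names running_average and throttle_decisions, and the notion of a numeric "reading" with a limit, are assumptions made up for this example. The aggregate is updated as each element arrives instead of being computed over a stored data set, and each updated value is immediately turned into a decision the application could act on.

```python
from typing import Iterable, Iterator


def running_average(readings: Iterable[float]) -> Iterator[float]:
    """Continuously aggregate: yield the mean of all readings seen so far.

    The input may be unbounded; only a count and a running total are kept,
    so memory use does not grow with the length of the stream.
    """
    count = 0
    total = 0.0
    for value in readings:
        count += 1
        total += value
        yield total / count


def throttle_decisions(readings: Iterable[float], limit: float) -> Iterator[bool]:
    """Feed the aggregate back into the application (hypothetical use case).

    Emits True whenever the running average exceeds `limit`, which an
    application could use to change its behavior while data keeps arriving.
    """
    for average in running_average(readings):
        yield average > limit


if __name__ == "__main__":
    inbound = [10.0, 20.0, 30.0]  # stand-in for a live, unbounded source
    for decision in throttle_decisions(inbound, limit=12.0):
        print("throttle" if decision else "ok")
```

Because both functions are generators, nothing is computed until a downstream consumer asks for the next result, and the same code works whether the source is a short list or a stream that never ends.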
Three Waves Toward Fast Data
The first wave of big data was “data at rest.” We stored massive amounts in Hadoop ...