5 Streaming Data Processing for IoT

Carolina Fortuna and Timotej Gale

Jožef Stefan Institute, Ljubljana, Slovenia

5.1 Introduction

We live in a time when data are generated faster than humans can consume them and information is disseminated faster than ever before. Everyone with access to a connected device can create and consume content: a news article, an image, a tweet, a video, and so on. Furthermore, the number of connected sensors generating data continues to increase rapidly, as we discussed in Chapter 1. Various systems that help humans summarize, organize, and easily search through these ever‐increasing amounts of data have been developed over the past three decades or so. Perhaps the most widely used are web search engines [1], which help find content in a vast network such as the Web. Most such large data organization systems perform batch processing: they ingest raw data in batches and extract a model from each batch. Batches can vary in size and in the time period covered. For instance, a new batch can come in hourly, daily, or even monthly.
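As a minimal illustration of the batch approach described above, the following Python sketch groups timestamped sensor readings into hourly batches and extracts a toy "model" (here just a count and a mean) from each batch. The function names `hourly_batches` and `extract_model`, the sample readings, and the choice of summary statistics as the model are all assumptions made for illustration; they are not taken from the chapter.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_batches(readings):
    """Group (timestamp, value) readings into batches keyed by the hour they fall in."""
    batches = defaultdict(list)
    for ts, value in readings:
        # Truncate the timestamp to the start of its hour to get the batch key.
        batches[ts.replace(minute=0, second=0, microsecond=0)].append(value)
    return dict(batches)


def extract_model(batch):
    """Hypothetical 'model': summary statistics computed over one whole batch."""
    return {"count": len(batch), "mean": mean(batch)}


# Made-up temperature readings, used only to exercise the sketch.
readings = [
    (datetime(2024, 1, 1, 9, 5), 20.0),
    (datetime(2024, 1, 1, 9, 40), 22.0),
    (datetime(2024, 1, 1, 10, 15), 25.0),
]

# Each batch is processed only after all of its readings have been collected.
models = {hour: extract_model(batch) for hour, batch in hourly_batches(readings).items()}
```

Changing the batch key (for example, truncating to the day instead of the hour) changes the time period each batch covers without changing the rest of the pipeline.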

In some application areas, delivering information at speed is important and processing large data in batches is not sufficient. Systems in which data are delivered in real‐time or near real‐time and processed as they arrive are referred to as stream processing systems. An excellent example of such a system is a financial trading system, where information access to news and stock prices and ...
