8 Fog Computing

8.1. Introduction

In the data lake concept, all data are ingested, regardless of their volume, variety and velocity. Storing all these data can be a challenge, even though technology offers several infrastructure and environment options, such as on-premises, cloud or hybrid cloud. The Internet of Things (IoT) has changed how data are acquired into the data lake environment, and some data lakes could reach their volume limits in the near future. An interesting concept, named fog computing, has recently been introduced. One main characteristic of fog computing is that the data ingestion steps are shared between the sensors, which produce data, and the data lake, which consumes data.

This chapter first explains the concept of fog computing and the associated challenges and then discusses the different options to be taken into account when dealing with a data lake.

8.2. A little bit of context

The cloud is often seen by the end user as an infinite space, where all of their data could be stored, accessed, shared and curated easily. That was indeed the case when the Internet was mainly there to store data produced by humans. Its main features were to store and share data like videos, songs, texts and hypertexts. However, this is changing gradually since numerous objects now also consume bandwidth and cloud storage to manipulate the data they produce. The key change concerns who (or what) is using these data and what for. Traditionally, ...
