Chapter 2: Discovering Storage and Compute Data Lakes
In the previous chapter, we discussed the immense power that data possesses, but with immense power comes increased responsibility. In the past, organizations have focused mainly on detecting trends in data, with the goal of accelerating revenue. Very often, however, they have paid less attention to the vulnerabilities caused by inconsistent data management and delivery.
In this chapter, we will discuss some ways a data lake can effectively deal with the ever-growing demands of the analytical world.
We will cover the following topics:
- Introducing data lakes
- Segregating storage and compute in a data lake
- Discovering data lake architectures
Introducing data lakes