The Cloud Data Lake

by Rukmani Gopalan
December 2022
Content level: Beginner to intermediate
244 pages
7h
English
O'Reilly Media, Inc.

Chapter 4. Scalable Data Lakes

If you change the way you look at things, the things you look at change.

Wayne Dyer

After reading the first three chapters, you should have all you need to get your data lake architecture up and running on the cloud, at a reasonable cost profile for your organization. In theory, you also have your first set of use cases and scenarios running successfully in production. Your data lake is so successful that demand for more scenarios keeps rising, and you are busy serving the needs of your new customers. Your business is booming, and your data estate is growing rapidly. As they say in business, going from zero to one is a different challenge than going from one to one hundred or from one hundred to one thousand. To ensure that your design scales and continues to perform as your data and use cases grow, it’s important to understand the factors that affect the scale and performance of your data lake. Contrary to popular opinion, scale and performance are not always a trade-off against costs; rather, they very much go hand in hand. In this chapter, we will take a closer look at these considerations, as well as strategies to optimize your data lake for scale while continuing to optimize for costs. Once again, we will use Klodars Corporation, a fictitious organization, to illustrate our strategies. We will build on these fundamentals to focus on performance in Chapter 5.

A Sneak Peek into Scalability

Scale and performance are terms ...

Publisher Resources

ISBN: 9781098116576