Learning and Operating Presto

by Angelica Lo Duca, Tim Meehan, Vivek Bharathan, Ying Su
September 2023
Intermediate to advanced
191 pages
4h 32m
English
O'Reilly Media, Inc.

Chapter 5. Open Data Lakehouse Analytics

So far, you have learned how to connect Presto to a data lake using standard connectors such as MySQL and Pinot. In addition, you have learned how to write a custom connector using Presto’s Java classes and methods. Finally, you have connected a client to Presto to run generic or custom queries. Now it’s time to use Presto in an advanced, more realistic scenario that addresses the main challenges of big data management: table lookup, concurrent access to data, and access control.

In this chapter, we will give an overview of the data lakehouse and implement a practical scenario. The chapter is divided into two parts. In the first part, we introduce the architecture of a data lakehouse, focusing on its main components. In the second part of the chapter, you will implement a practical data lakehouse scenario using Presto and completely open components.

The Emergence of the Lakehouse

The first generation of data lakes, based primarily on the Hadoop Distributed File System (HDFS), demonstrated the promise of analytics at scale. As a result, many organizations formed data platform architectures consisting of data lakes and data warehouses, stitching pipelines and workflows between them. However, the resulting platform was very complex, with issues around reliability, data freshness, and cost.¹

To overcome these issues, organizations tried to stretch both the data lake and the data warehouse in terms of the workloads they could support, but with ...


ISBN: 9781098141844