Learning and Operating Presto

by Angelica Lo Duca, Tim Meehan, Vivek Bharathan, Ying Su
September 2023
Intermediate to advanced
191 pages
4h 32m
English
O'Reilly Media, Inc.

Chapter 1. Introduction to Presto

Over the last few years, the increasing availability of diverse data produced by users and machines has raised new challenges for organizations that want to make sense of their data and make better decisions. Becoming a data-driven organization is crucial to finding insights, driving change, and paving the way to new opportunities. Although this requires a significant amount of data, the benefits are worth the effort.

This large amount of data comes in different formats, is provided by different data sources, and is searchable with different query languages. In addition, users searching for valuable insights need results very quickly, which calls for high-performance query engines. These challenges caused companies such as Facebook (now Meta), Airbnb, Uber, and Netflix to rethink how they manage data. They have progressively moved from the old paradigm based on data warehouses to data lakehouses. While a data warehouse manages structured and historical data, a data lakehouse can also manage and get insights from unstructured and real-time data.

Presto is one possible solution to these challenges. Presto is a distributed SQL query engine created and used at scale by Facebook. You can integrate Presto into your data lake to build fast-running SQL queries that interact with your data wherever it is physically located, regardless of its original format.

This chapter will introduce you to the concept of the data lake and how it differs from ...

ISBN: 9781098141844