Presto First Steps
SQL at any scale
Presto is an open source distributed SQL query engine that lets you efficiently query data in disparate sources of all sizes. It exposes these sources through standard SQL and therefore works with a wide choice of established business intelligence and reporting tools. Presto can combine multiple data sources in a single query and is designed for fast analytical queries. Initially developed at Facebook, open source Presto is now used by a wide range of companies, including Netflix and Airbnb.
Join expert Manfred Moser for a hands-on introductory overview of Presto. Through concise demonstrations and interactive exercises, you’ll learn how to run a Presto cluster for production usage, how to configure catalogs to connect data sources, and how to use various tools to query data. You’ll leave with everything you need to get started with this powerful query engine.
What you'll learn and how you can apply it
By the end of this live online course, you’ll understand:
- The underpinnings that make Presto a scalable, high-performance query engine
- How to connect numerous, largely different data sources
- How to use SQL to query data in any data source directly, even simultaneously
- How to query and visualize with tools to gain better insights
- Resources to help you continue learning
And you’ll be able to:
- Install and run Presto for testing and a Presto cluster for production usage
- Configure catalogs to connect data sources
- Run SQL queries with the Presto CLI
- Set up other SQL tools with a JDBC connection
- Discover what connectors are available
This training course is for you because...
- You’re a data analyst trying to work with one or more data storage systems.
- You work with numerous databases and need to query them at scale.
- You don't want to use different query languages and tools for different data sources.
- You want to become an efficient data analyst and enable other analysts.
Prerequisites
- A working knowledge of the command line on a Linux or macOS system
- A basic understanding of databases and SQL
Recommended preparation
- Read “Introducing Presto” (chapter 1 in Presto: The Definitive Guide)
About your instructor
Manfred Moser is an experienced software developer, writer, and trainer with a long history in the Java and Android open source communities. He’s created dozens of training courses and taught tens of thousands of students online and in person at conferences such as JavaOne, OSCON, AnDevCon, and others. He’s the coauthor of Presto: The Definitive Guide and a major contributor to Presto’s documentation.
Schedule
The timeframes are only estimates and may vary according to how the class is progressing.
What is Presto? (15 minutes)
- Presentation: RDBMS concepts like schema, table, column, row/record, data type; SQL; query engine; massively parallel processing; data sources = catalog; no data storage in Presto itself
Installing and running Presto for the first time (25 minutes)
- Presentation and Katacoda interactive exercises: Download tar.gz; check prerequisites; configure system; start up; configure catalogs; start up again
Break (5 minutes)
Connecting to Presto and running queries (30 minutes)
- Presentation: Overview of connecting and available clients
- Katacoda interactive exercises: Install and use CLI; install and use DBeaver with JDBC driver; run simple queries in both, such as showing catalogs and schemas and selecting from a table
Diving deeper into configuring and installing Presto (25 minutes)
- Presentation: Config files; logging; catalog files; cluster concept with coordinator and worker
- Katacoda interactive exercises: Configure and start cluster; web UI inspection and cluster stats
Break (5 minutes)
Catalogs, connectors, and data sources (30 minutes)
- Presentation: Connector overview; connector configuration
- Katacoda interactive exercises: Add catalogs using different connectors (for example, the simple memory connector, PostgreSQL, and the Hive connector to query an object storage system)
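As a rough illustration of what adding a catalog involves, the sketch below renders the contents of two catalog properties files. The catalog names, host, database, user, and password are hypothetical placeholders and are not taken from the course material; only the property keys follow the usual Presto connector configuration.

```python
# Sketch: render the text of Presto catalog properties files.
# All connection values below are hypothetical placeholders.

def render_catalog(properties: dict) -> str:
    """Return the body of a catalog .properties file, one key=value per line."""
    return "".join(f"{key}={value}\n" for key, value in properties.items())

catalogs = {
    # The memory connector needs only its connector name.
    "memory": {"connector.name": "memory"},
    # A PostgreSQL catalog also needs JDBC connection details.
    "postgresql": {
        "connector.name": "postgresql",
        "connection-url": "jdbc:postgresql://db.example.com:5432/sales",
        "connection-user": "presto_reader",
        "connection-password": "change-me",
    },
}

for name, props in catalogs.items():
    # Each catalog is typically defined in etc/catalog/<name>.properties.
    print(f"# etc/catalog/{name}.properties")
    print(render_catalog(props))
```

The file name, without the .properties extension, becomes the catalog name used in queries.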
SQL support in Presto (40 minutes)
- Presentation and Katacoda interactive exercises: Use catalogs; use CLI and DBeaver; write different SQL queries; learn data types, information schema, and so on; add usage of functions and operators; write a federated query
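To give a sense of what a federated query looks like, the sketch below builds one SQL statement that joins tables from two catalogs by using fully qualified catalog.schema.table names. The catalogs, schemas, tables, and columns are hypothetical and would need to match catalogs actually configured on a cluster.

```python
# Sketch: compose a federated query that joins tables from two catalogs.
# All catalog, schema, table, and column names are hypothetical.

def qualified(catalog: str, schema: str, table: str) -> str:
    """Build a fully qualified Presto table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"

def federated_query(orders: str, customers: str) -> str:
    """Return SQL that counts orders per customer across two data sources."""
    return (
        "SELECT c.name, count(*) AS order_count\n"
        f"FROM {orders} AS o\n"
        f"JOIN {customers} AS c ON o.customer_id = c.id\n"
        "GROUP BY c.name"
    )

sql = federated_query(
    qualified("hive", "web", "orders"),               # e.g. data in object storage
    qualified("postgresql", "public", "customers"),   # e.g. data in an RDBMS
)
print(sql)
```

The resulting statement could be run in the Presto CLI or in DBeaver like any other query; the engine resolves each table through its catalog.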
Wrap-up and Q&A (5 minutes)