Amazon Web Services (AWS)

Getting Started with Amazon Athena


Using SQL to query distributed, big data

This event has ended.

What you’ll learn and how you can apply it

By the end of this live online course, you’ll understand:

  • The differences between a traditional RDBMS, cloud-native databases, and Athena
  • How data is stored for Athena
  • How to connect Athena to your data
  • How to use the Athena console to query your data

And you’ll be able to:

  • Query data in CSV, Parquet, and JSON files stored in S3 from Athena
  • Use the Athena console to query data
  • Use the AWS SDK in a Python application to run queries on Athena remotely

This course is for you because…

  • You’re a business intelligence (BI) professional working with big data (specifically, running ad hoc queries on your data).
  • You’re a programmer writing services that have to run remote queries on Athena.
  • You already work with SQL but are hesitant to learn proprietary query languages that aren’t standardized.
  • You’re interested in working with big data and analytics.

Prerequisites

  • A working knowledge of SQL and Python
  • An AWS account


Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Introduction (15 minutes)

  • Presentation: Overview of the data used in the example; overview of the Python project; demo of the finished product

Creating a table in Athena (40 minutes)

  • Presentation: Getting the sample dataset; understanding the sample dataset; creating a bucket in S3 to host the sample dataset; creating a table in Athena; declaring the columns required in Athena
  • Jupyter Notebook exercise: Create a table in Athena
  • Q&A
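
As a rough illustration of the table-creation step, the sketch below builds the kind of CREATE EXTERNAL TABLE statement Athena accepts for comma-separated files in S3. It is not the course's own code: the database, table, column names, and bucket path are hypothetical placeholders, and it assumes a CSV dataset with a single header row.

```python
def build_create_table_ddl(database, table, columns, s3_location):
    """Return a CREATE EXTERNAL TABLE statement for CSV files under s3_location.

    columns is a list of (name, athena_type) pairs. All names passed in
    here are illustrative; the course's sample dataset may differ.
    """
    column_lines = ",\n".join(
        f"  `{name}` {sql_type}" for name, sql_type in columns
    )
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        f"{column_lines}\n"
        ")\n"
        "ROW FORMAT DELIMITED\n"
        "FIELDS TERMINATED BY ','\n"
        # LOCATION points at an S3 prefix (folder), not at a single file.
        f"LOCATION '{s3_location}'\n"
        # Assumption: the CSV files start with one header line to skip.
        "TBLPROPERTIES ('skip.header.line.count'='1')"
    )


# Hypothetical database, table, columns, and bucket.
ddl = build_create_table_ddl(
    database="sample_db",
    table="trips",
    columns=[("trip_id", "int"), ("city", "string"), ("distance_km", "double")],
    s3_location="s3://example-bucket/trips/",
)
print(ddl)
```

The table is only metadata: the data stays in S3, and the declared columns tell Athena how to interpret each file at query time.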

Break (5 minutes)

Querying data in Athena (40 minutes)

  • Presentation: Starting with simple SELECT queries; queries with WHERE clauses; aggregation queries; how query results are stored
  • Jupyter Notebook exercise: Run sample queries in Athena
  • Q&A
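
The three query styles listed above might look like the following sketch, written as Python strings so they can be reused later. The table and column names are hypothetical. The helper reflects how Athena stores results for queries started through the API: as a CSV object named after the query execution ID under the chosen output location, assuming no workgroup setting overrides that location.

```python
# Hypothetical table; substitute the one created from the sample dataset.
TABLE = "sample_db.trips"

SAMPLE_QUERIES = {
    # Simple SELECT: peek at a few rows.
    "simple_select": f"SELECT * FROM {TABLE} LIMIT 10",
    # WHERE clause: filter rows on a column value.
    "where_clause": f"SELECT trip_id, city FROM {TABLE} WHERE city = 'Pune'",
    # Aggregation: one summary row per city.
    "aggregation": (
        f"SELECT city, COUNT(*) AS trips, AVG(distance_km) AS avg_km "
        f"FROM {TABLE} GROUP BY city ORDER BY trips DESC"
    ),
}


def result_csv_location(output_location, query_execution_id):
    """Return the expected S3 URI of a query's CSV result file.

    Assumes the query was started with an explicit output location and
    that no workgroup configuration overrides it.
    """
    return f"{output_location.rstrip('/')}/{query_execution_id}.csv"


for name, sql in SAMPLE_QUERIES.items():
    print(f"{name}: {sql}")
```

Rather than relying on this naming pattern, a program can read the exact result path from the query execution details that Athena returns.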

How to use the Jupyter Notebook (15 minutes)

  • Demo and Jupyter Notebook exercise: Explore the Jupyter Notebook
  • Q&A

Break (5 minutes)

The final Python project (50 minutes)

  • Presentation: Configuring the AWS Python SDK; writing the service layer required for Athena; rewriting the sample queries used earlier in the demo; executing these queries remotely and getting results
  • Jupyter Notebook exercise: Write a Python service
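
A minimal sketch of what such a service layer could look like with boto3, the AWS SDK for Python, follows. It is an assumption about the general shape, not the instructor's implementation: the function name `run_query` and its parameters are hypothetical, while `start_query_execution`, `get_query_execution`, and `get_query_results` are the Athena client calls in boto3. The client is passed in so the function can be exercised without AWS credentials.

```python
import time


def run_query(client, sql, database, output_location, poll_seconds=1.0):
    """Run sql on Athena and return the first page of rows as dicts.

    client is expected to be boto3.client("athena") or an object with
    the same three methods. Values come back as strings (or None).
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Athena runs queries asynchronously, so poll until a final state.
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        state = status["State"]
        if state == "SUCCEEDED":
            break
        if state in ("FAILED", "CANCELLED"):
            reason = status.get("StateChangeReason", "no reason given")
            raise RuntimeError(f"Query {query_id} {state}: {reason}")
        time.sleep(poll_seconds)

    # Only the first page is read here; larger results need NextToken.
    result = client.get_query_results(QueryExecutionId=query_id)
    rows = result["ResultSet"]["Rows"]
    if not rows:
        return []
    # For SELECT queries the first row holds the column names.
    header = [cell.get("VarCharValue") for cell in rows[0]["Data"]]
    return [
        dict(zip(header, [cell.get("VarCharValue") for cell in row["Data"]]))
        for row in rows[1:]
    ]


# Usage, assuming boto3 is installed and AWS credentials are configured:
#   import boto3
#   rows = run_query(boto3.client("athena"), "SELECT 1 AS one",
#                    "sample_db", "s3://example-bucket/athena-results/")
```

Passing the client in as an argument keeps the polling logic separate from AWS configuration, which makes it straightforward to substitute a stub in tests.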

Wrap-up and Q&A (10 minutes)

Your Instructor

  • Sunny Srinidhi

    Sunny Srinidhi is a Senior Software Engineer at Lowe’s India and has been working in the data space for over seven years. He writes microservices to work with data at scale and has experience using a variety of databases, including Oracle, MySQL, MongoDB, and Apache HBase. A frequent blogger on Medium and his personal blog, Sunny is always interested in learning about and exploring the next exciting data-related tool.
