BigQuery on the Google Cloud Platform in 3 Weeks

Published by O'Reilly Media, Inc.

Content level: Intermediate

A hands-on guide to ingesting, analyzing, and querying data using BigQuery

Course outcomes

  • Understand how to load and manage data in BigQuery
  • Optimize query performance using materialized views, partitioning, and clustering
  • Learn how to build ML models in SQL using BigQuery ML

Course description

Join expert Janani Ravi to get a complete overview of working with BigQuery, the serverless, autoscaling data warehouse on Google Cloud Platform. No experience with BigQuery is necessary, as you’ll spend most of the time in hands-on exploration, creating datasets and tables and executing SQL queries to analyze and process your data. You’ll use Data Studio to visualize your data, and the bq command-line tool and Python notebooks to access and query data in BigQuery. You’ll create views over your tables and administer BigQuery tables and views using role-based access controls, and you’ll improve the performance of your queries by using partitioned and clustered tables. Finally, you’ll see how BigQuery democratizes machine learning by allowing you to build a variety of models, including regression, classification, clustering, and recommendation systems.

Week 1: Getting Started with BigQuery

Week 2: Configuring and Administering BigQuery

Week 3: Machine Learning with BigQuery

NOTE: With today’s registration, you’ll be signed up for all three weeks. Although you can attend any of the sessions individually, we recommend participating in all three weeks.

What you’ll learn and how you can apply it

  • Run queries in BigQuery from the Cloud Console
  • Use the BigQuery command-line tool to create and query tables
  • Secure your BigQuery tables and views
  • Visualize data in BigQuery (including geospatial data)
  • Query data in BigQuery using Python APIs
  • Apply ML techniques on data stored in BigQuery using SQL

This live event is for you because...

  • You’re a data analyst who wants to get hands-on experience working with BigQuery on GCP.
  • You’re a systems administrator who wants to move into the data analyst role.
  • You’re working on large datasets in BigQuery and want to learn how to improve the performance of your queries.
  • You have experience with data warehouses on other cloud platforms and want to explore BigQuery features.

Prerequisites

  • Create a free Google Cloud Platform account and enable billing on that account to provision resources as needed (participation in exercises is optional, but these preparations are necessary if you plan to participate)
  • A very basic understanding of cloud platforms such as AWS, Azure, or others
  • A basic understanding of SQL queries and how they work

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Week 1: Getting Started with BigQuery

Introducing the BigQuery data warehouse (30 minutes)

  • Presentation: Role of BigQuery on the GCP; comparing BigQuery with Cloud SQL, Cloud Spanner, and Cloud Bigtable; characteristics (serverless, autoscaled); architectural overview; features and pricing; projects, datasets, tables
  • Group discussion: Compare BigQuery with data warehouses on other cloud platforms
  • Q&A

Working with datasets and tables (30 minutes)

  • Hands-on exercises: Import and use public datasets; load data into tables from files and external sources; configure schemas and use schema auto-detection; execute simple, grouping, and aggregation queries and view results; use cached results in queries and run parameterized queries; use STRUCT, ARRAY_AGG, and UNNEST operations; visualize query results using Data Studio (including geospatial data)
  • Q&A
  • Break

Using the bq command-line tool (30 minutes)

  • Hands-on exercises: Use Cloud Shell to work with bq; create and configure datasets and tables; execute queries; load data
  • Q&A

Accessing BigQuery programmatically using Python (30 minutes)

  • Hands-on exercises: Use Python APIs to connect to BigQuery; run jobs using programmatic APIs; access results programmatically
  • Q&A

Week 2: Configuring and Administering BigQuery

Creating and working with views (30 minutes)

  • Presentation: Differences between tables and views
  • Hands-on exercises: Create views and configure view properties; execute queries on views; compare views and materialized views; create and query materialized views and authorized views
  • Group discussion: Choosing between view types; pros and cons of each type of view
  • Q&A
  • Break

Administering datasets, tables, and views (30 minutes)

  • Hands-on exercises: Configure role-based access to datasets, tables, and views; restrict access to specific rows and columns using derived tables
  • Q&A
  • Break

Partitioning and clustering tables (60 minutes)

  • Presentation: Differences between partitioning, sharding, and clustering; use cases of partitioning, sharding, and clustering
  • Hands-on exercises: Create and query partitioned tables using column-based partitioning; create and query clustered tables; compare query performance for partitioned and clustered tables
  • Group discussion: Choosing between partitioning, sharding, and clustering; pros and cons of each
  • Q&A

Week 3: Machine Learning with BigQuery

Regression and classification using BigQuery ML (60 minutes)

  • Presentation: Democratizing machine learning with BigQuery ML; understanding regression and classification; evaluation metrics for regression, classification, and clustering
  • Group discussion: Identify machine learning problem scenarios
  • Hands-on exercises: Create and evaluate a regression model in SQL; create and evaluate a classification model in SQL
  • Q&A
  • Break

Clustering and recommendations using BigQuery ML (60 minutes)

  • Presentation: Understanding clustering and recommendation systems
  • Hands-on exercises: Create and evaluate a clustering model in SQL; create and evaluate a recommendation system model in SQL; make predictions using trained models
  • Q&A

Your Instructor

  • Janani Ravi

    Janani Ravi is cofounder of Loonycorn, a team dedicated to upskilling IT professionals. She’s been involved in more than 100 online courses in data analytics, feature engineering, and machine learning. Previously, Janani worked at Google, Flipkart, and Microsoft. She completed her studies at Stanford.

Skill covered

Google BigQuery