Chapter 8. BigQuery and Data Warehousing

As more enterprises rely on real-time data and analytics to drive business decisions, data warehousing techniques are becoming more critical. As Google Cloud’s serverless, petabyte-scale data warehouse, BigQuery is often your first and last stop for data storage, large-scale analytics, and even SQL-based machine learning models. Because BigQuery is serverless, there are no clusters to create. You simply load your data into BigQuery and start querying.

BigQuery is also very cost-effective because compute and storage are separated and scale independently. If you never query your data, you are charged only for storage. When you do run queries, you have access to a huge amount of serverless compute to process your data quickly, and you pay only for the compute used when you query, instead of paying for idle workers in a cluster.
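To make the separation of storage and compute charges concrete, here is a minimal sketch of the arithmetic. It assumes query compute is billed in proportion to the data a query processes. The `Rates` class, the `monthly_cost` function, and the rate values are hypothetical placeholders for illustration, not actual BigQuery pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical placeholder rates, NOT actual BigQuery pricing."""
    storage_per_gb_month: float   # charge per GB stored for one month
    query_per_tb_processed: float  # charge per TB processed by queries


def monthly_cost(stored_gb: float, tb_processed: float, rates: Rates) -> float:
    """Return the total monthly charge as storage plus query compute."""
    storage = stored_gb * rates.storage_per_gb_month
    compute = tb_processed * rates.query_per_tb_processed
    return storage + compute


# Made-up rates chosen only to keep the arithmetic easy to follow.
rates = Rates(storage_per_gb_month=0.25, query_per_tb_processed=4.0)

# Same 1,000 GB stored in both cases; only the query volume differs.
idle = monthly_cost(stored_gb=1000, tb_processed=0, rates=rates)
active = monthly_cost(stored_gb=1000, tb_processed=10, rates=rates)

print(f"No queries: {idle:.2f}")
print(f"10 TB processed: {active:.2f}")
```

With no queries, the compute term is zero and only the storage charge remains. Running queries adds a charge that depends on how much data they process, not on how long a cluster sits provisioned.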

The following recipes show examples of implementing data loading, scalable data querying, and streaming in BigQuery. Included are tips and tricks beyond standard SQL skills, some of which are specific to the BigQuery service and implementation. Several recipes will also use the bq command-line tool covered in Chapter 1.

All code samples for this chapter are in this book’s GitHub repository. You can follow along and copy the code for each recipe by going to the folder with that recipe’s number.

8.1 Using Cloud Console to Run a BigQuery Query

Problem

You want to get started with BigQuery quickly.

Solution ...
