GCP: Complete Google Data Engineer and Cloud Architect Guide

Video description

Google Cloud Platform (GCP) is one of the most popular cloud offerings, and it is arguably among the best for high-end machine learning applications thanks to TensorFlow, the popular deep learning framework from Google. This course will help you learn the essential concepts needed to deploy TensorFlow applications on GCP.

The course starts with an introduction to cloud computing, Hadoop, and GCP, and helps you set up the lab for the exercises. You’ll understand various compute options, such as Google Compute Engine (GCE), and explore different storage options. As you advance, you’ll work with Cloud SQL, get an overview of BigTable and BigQuery through lab exercises, explore Dataflow and its Apache Beam programming model, and use Dataproc for managed Hadoop. You’ll also learn how to use Pub/Sub on GCP and explore the features of Datalab. The course will then take you through machine learning and TensorFlow concepts, and show you how to prepare a dataset to run a model and how to work with virtual machines and images. Finally, you will become familiar with networking and security concepts and get to grips with the basics of Hadoop.

By the end of this course, you’ll have developed the skills required to build TensorFlow and machine learning models on GCP.

What You Will Learn

  • Explore various compute options, such as App Engine and Container Engine
  • Discover how neural networks are trained
  • Work with managed instance groups and balance loads
  • Understand cloud computing security concepts
  • Find out how VPCs and interconnecting networks function
  • Get an overview of the basics of Hadoop


Whether you are looking to pass the Google Data Engineer or Cloud Architect Certification exams, want to build TensorFlow models and deploy them on the cloud, or want to use Google Cloud Platform in your organization, this course is for you. A basic understanding of Hadoop technology is all you need to get started with this course.

About The Author

Janani Ravi is a certified Google Cloud Architect and Data Engineer. She earned her master's degree in electrical engineering from Stanford. She is a cofounder of Loonycorn, a technical video content studio. Before cofounding Loonycorn, she worked for several years as a software engineer at leading companies such as Google and Microsoft.

Table of contents

  1. Chapter 1 : You, This Course, and Us
    1. You, This Course and Us
  2. Chapter 2 : Introduction
    1. Theory, Practice and Tests
    2. Why Cloud?
    3. Hadoop and Distributed Computing
    4. On-premise, Colocation or Cloud?
    5. Introducing the Google Cloud Platform
    6. Lab: Setting Up A GCP Account
    7. Lab: Using The Cloud Shell
  3. Chapter 3 : Compute Choices
    1. Compute Options
    2. Google Compute Engine (GCE)
    3. More GCE
    4. Lab: Creating a VM Instance
    5. Lab: Editing a VM Instance
    6. Lab: Creating a VM Instance Using The Command Line
    7. Lab: Creating And Attaching A Persistent Disk
    8. Google Container Engine - Kubernetes (GKE)
    9. More GKE
    10. Lab: Creating A Kubernetes Cluster And Deploying a Wordpress Container
    11. App Engine
    12. Contrasting App Engine, Compute Engine, and Container Engine
    13. Lab: Deploy and Run an App Engine App
  4. Chapter 4 : Storage
    1. Storage Options
    2. Quick Take
    3. Cloud Storage
    4. Lab: Working With Cloud Storage Buckets
    5. Lab: Bucket and Object Permissions
    6. Lab: Life-Cycle Management on Buckets
    7. Lab: Running a Program on a VM Instance and Storing Results on Cloud Storage
    8. Transfer Service
    9. Lab: Migrating Data Using the Transfer Service
    10. Lab: Cloud Storage ACLs and API access with Service Account
    11. Lab: Cloud Storage Customer-Supplied Encryption Keys and Life-Cycle Management
    12. Lab: Cloud Storage Versioning, Directory Sync
  5. Chapter 5 : Cloud SQL, Cloud Spanner ~ OLTP ~ RDBMS
    1. Cloud SQL
    2. Lab: Creating A Cloud SQL Instance
    3. Lab: Running Commands On Cloud SQL Instance
    4. Lab: Bulk Loading Data Into Cloud SQL Tables
    5. Cloud Spanner
    6. More Cloud Spanner
    7. Lab: Working with Cloud Spanner
  6. Chapter 6 : BigTable ~ HBase = Columnar Store.
    1. BigTable Intro
    2. Columnar Store
    3. Denormalised
    4. Column Families
    5. BigTable Performance
    6. Lab: BigTable demo
  7. Chapter 7 : Datastore ~ Document Database
    1. Datastore
    2. Lab: Datastore demo
  8. Chapter 8 : BigQuery ~ Hive ~ OLAP
    1. BigQuery Intro
    2. BigQuery Advanced
    3. Lab: Loading CSV Data into BigQuery
    4. Lab: Running Queries On BigQuery
    5. Lab: Loading JSON Data With Nested Tables
    6. Lab: Public Datasets in BigQuery
    7. Lab: Using BigQuery Via The Command Line
    8. Lab: Aggregations And Conditionals In Aggregations
    9. Lab: Subqueries And Joins
    10. Lab: Regular Expressions In Legacy SQL
    11. Lab: Using The With Statement For SubQueries
  9. Chapter 9 : Dataflow ~ Apache Beam
    1. Dataflow Intro
    2. Apache Beam
    3. Lab: Running a Python Dataflow Program
    4. Lab: Running a Java Dataflow Program
    5. Lab: Implementing Word Count In Dataflow Java
    6. Lab: Executing The Word Count Dataflow
    7. Lab: Executing MapReduce In Dataflow In Python
    8. Lab: Executing MapReduce In Dataflow In Java
    9. Lab: Dataflow with BigQuery as Source and Side Inputs
    10. Lab: Dataflow with BigQuery as Source and Side Inputs 2
  10. Chapter 10 : Dataproc ~ Managed Hadoop
    1. Dataproc
    2. Lab: Creating And Managing A Dataproc Cluster
    3. Lab: Creating A Firewall Rule To Access Dataproc
    4. Lab: Running A PySpark Job on Dataproc
    5. Lab: Running the PySpark REPL Shell And Pig Scripts On Dataproc
    6. Lab: Submitting a Spark Jar to Dataproc
    7. Lab: Working with Dataproc using the GCloud CLI
  11. Chapter 11 : Pub/Sub for Streaming.
    1. Pub/Sub
    2. Lab: Working with Pub/Sub on the Command Line
    3. Lab: Working with Pub/Sub Using the Web Console
    4. Lab: Setting Up a Pub/Sub Publisher Using the Python Library
    5. Lab: Setting Up a Pub/Sub Subscriber Using the Python Library
    6. Lab: Publishing Streaming Data into Pub/Sub
    7. Lab: Reading Streaming Data from Pub/Sub and Writing to BigQuery
    8. Lab: Executing a Pipeline to Read Streaming Data and Write to BigQuery
    9. Lab: Pub/Sub Source BigQuery Sink
  12. Chapter 12 : Datalab ~ Jupyter
    1. Datalab
    2. Lab: Creating and Working on a Datalab Instance
    3. Lab: Importing and Exporting Data using Datalab
    4. Lab: Using The Charting API In Datalab
  13. Chapter 13 : TensorFlow and Machine Learning
    1. Introducing Machine Learning
    2. Representation Learning
    3. Neural Networks (NN) Introduced
    4. Introducing TF
    5. Lab: Simple Math Operations
    6. Computation Graph
    7. Tensors
    8. Lab: Tensors
    9. Linear Regression Intro
    10. Placeholders and Variables
    11. Lab: Placeholders
    12. Lab: Variables
    13. Lab: Linear Regression with Made-up Data
    14. Image Processing
    15. Images As Tensors
    16. Lab: Reading and Working with Images
    17. Lab: Image Transformations
    18. Introducing MNIST
    19. K-Nearest Neighbours as Unsupervised Learning
    20. One-hot Notation and L1 Distance
    21. Steps in the K-Nearest-Neighbours Implementation
    22. Lab: K-Nearest-Neighbours
    23. Learning Algorithm
    24. Individual Neuron
    25. Learning Regression
    26. Learning XOR
    27. XOR Trained
  14. Chapter 14 : Regression in TensorFlow
    1. Lab: Access Data from Yahoo Finance
    2. Non-TensorFlow Regression
    3. Lab: Linear Regression - Setting Up a Baseline
    4. Gradient Descent
    5. Lab: Linear Regression
    6. Lab: Multiple Regression in TensorFlow
    7. Logistic Regression Introduced
    8. Linear Classification
    9. Lab: Logistic Regression - Setting Up a Baseline
    10. Logit
    11. Softmax
    12. Argmax
    13. Lab: Logistic Regression
    14. Estimators
    15. Lab: Linear Regression using Estimators
    16. Lab: Logistic Regression using Estimators
  15. Chapter 15 : Vision, Translate, NLP and Speech: Trained ML APIs
    1. Lab: Taxicab Prediction - Setting up the dataset
    2. Lab: Taxicab Prediction - Training and Running the model
    3. Lab: The Vision, Translate, NLP, and Speech API
    4. Lab: The Vision API for Label and Landmark Detection
  16. Chapter 16 : Virtual Machines and Images
    1. Live Migration
    2. Machine Types and Billing
    3. Sustained Use and Committed Use Discounts
    4. Rightsizing Recommendations
    5. RAM Disk
    6. Images
    7. Startup Scripts and Baked Images
  17. Chapter 17 : VPCs and Interconnecting Networks
    1. VPCs and Subnets
    2. Global VPCs, Regional Subnets
    3. IP Addresses
    4. Lab: Working with Static IP Addresses
    5. Routes
    6. Firewall Rules
    7. Lab: Working with Firewalls
    8. Lab: Working with Auto Mode and Custom Mode Networks
    9. Lab: Bastion Host
    10. Cloud VPN
    11. Lab: Working with Cloud VPN
    12. Cloud Router
    13. This video explains the cloud router.
    14. Dedicated Interconnect Direct and Carrier Peering
    15. Shared VPCs
    16. Lab: Shared VPCs
    17. VPC Network Peering
    18. Lab: VPC Peering
    19. Cloud DNS and Legacy Networks
  18. Chapter 18 : Managed Instance Groups and Load Balancing
    1. Managed and Unmanaged Instance Groups
    2. Types of Load Balancing
    3. Overview of HTTP(S) Load Balancing
    4. Forwarding Rules, Target Proxy, and URL Maps
    5. Backend Service and Backends
    6. Load Distribution and Firewall Rules
    7. Lab: HTTP(S) Load Balancing
    8. Lab: Content-Based Load Balancing
    9. SSL Proxy and TCP Proxy Load Balancing
    10. Lab: SSL Proxy Load Balancing
    11. Network Load Balancing
    12. Internal Load Balancing
    13. Autoscalers
    14. Lab: Autoscaling with Managed Instance Groups
  19. Chapter 19 : Ops and Security
    1. StackDriver
    2. StackDriver Logging
    3. Lab: StackDriver Resource Monitoring
    4. Lab: StackDriver Error Reporting and Debugging
    5. Cloud Deployment Manager
    6. Lab: Using Deployment Manager
    7. Lab: Deployment Manager and StackDriver
    8. Cloud Endpoints
    9. Cloud IAM: User accounts, Service accounts, API Credentials
    10. Cloud IAM: Roles, Identity-Aware Proxy, Best Practices
    11. Lab: Cloud IAM
    12. Data Protection
  20. Chapter 20 : Appendix: Hadoop Ecosystem
    1. Introduction to the Hadoop Ecosystem
    2. Hadoop
    3. HDFS
    4. MapReduce
    5. Yarn
    6. Hive
    7. Hive vs. RDBMS
    8. HQL vs. SQL
    9. OLAP in Hive
    10. Windowing Hive
    11. Pig
    12. More Pig
    13. Spark
    14. More Spark
    15. Streams Intro
    16. Microbatches
    17. Window Types

Product information

  • Title: GCP: Complete Google Data Engineer and Cloud Architect Guide
  • Author(s): Janani Ravi
  • Release date: November 2020
  • Publisher(s): Packt Publishing
  • ISBN: 9781788999519