Book description
Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build on top of the Google Cloud Platform (GCP). This hands-on guide shows developers entering the data science field how to implement an end-to-end data pipeline, using statistical and machine learning methods and tools on GCP. Over the course of the book, you’ll work through a sample business decision by employing a variety of data science approaches.
Follow along by implementing these statistical and machine learning solutions in your own project on GCP, and discover how this platform provides a transformative and more collaborative way of doing data science.
You’ll learn how to:
- Automate and schedule data ingest, using an App Engine application
- Create and populate a dashboard in Google Data Studio
- Build a real-time analysis pipeline to carry out streaming analytics
- Conduct interactive data exploration with Google BigQuery
- Create a Bayesian model on a Cloud Dataproc cluster
- Build a logistic regression machine-learning model with Spark
- Compute time-aggregate features with a Cloud Dataflow pipeline
- Create a high-performing prediction model with TensorFlow
- Use your deployed model as a microservice you can access from both batch and real-time pipelines
Table of contents
- Preface
- 1. Making Better Decisions Based on Data
- 2. Ingesting Data into the Cloud
- 3. Creating Compelling Dashboards
  - Explain Your Model with Dashboards
  - Why Build a Dashboard First?
  - Accuracy, Honesty, and Good Design
  - Loading Data into Google Cloud SQL
  - Create a Google Cloud SQL Instance
  - Interacting with Google Cloud Platform
  - Controlling Access to MySQL
  - Create Tables
  - Populating Tables
  - Building Our First Model
  - Building a Dashboard
  - Getting Started with Data Studio
  - Summary
- 4. Streaming Data: Publication and Ingest
- 5. Interactive Data Exploration
- 6. Bayes Classifier on Cloud Dataproc
- 7. Machine Learning: Logistic Regression in Spark and BigQuery
- 8. Time-Windowed Aggregate Features
- 9. Machine Learning Classifier Using TensorFlow
- 10. Real-Time Machine Learning
- A. Considerations for Sensitive Data within Machine Learning Datasets
- Index
Product information
- Title: Data Science on the Google Cloud Platform
- Author(s):
- Release date: January 2018
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491974568