Book description
Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build on Google Cloud Platform (GCP). This hands-on guide shows data engineers and data scientists how to implement an end-to-end data pipeline with cloud-native tools on GCP.
Throughout this updated second edition, you'll work through a sample business decision by employing a variety of data science approaches. Follow along by building a data pipeline in your own project on GCP, and discover how to solve data science problems in a transformative and more collaborative way.
You'll learn how to:
- Employ best practices in building highly scalable data and ML pipelines on Google Cloud
- Automate and schedule data ingest using Cloud Run
- Create and populate a dashboard in Data Studio
- Build a real-time analytics pipeline using Pub/Sub, Dataflow, and BigQuery
- Conduct interactive data exploration with BigQuery
- Create a Bayesian model with Spark on Cloud Dataproc
- Forecast time series and do anomaly detection with BigQuery ML
- Aggregate within time windows with Dataflow
- Train explainable machine learning models with Vertex AI
- Operationalize ML with Vertex AI Pipelines
Table of contents
- Preface
- 1. Making Better Decisions Based on Data
- 2. Ingesting Data into the Cloud
- 3. Creating Compelling Dashboards
- 4. Streaming Data: Publication and Ingest with Pub/Sub and Dataflow
- 5. Interactive Data Exploration with Vertex AI Workbench
- 6. Bayesian Classifier with Apache Spark on Cloud Dataproc
- 7. Logistic Regression Using Spark ML
- 8. Machine Learning with BigQuery ML
- 9. Machine Learning with TensorFlow in Vertex AI
- 10. Getting Ready for MLOps with Vertex AI
- 11. Time-Windowed Features for Real-Time Machine Learning
- 12. The Full Dataset
- Conclusion
- A. Considerations for Sensitive Data Within Machine Learning Datasets
- Index
- About the Author
Product information
- Title: Data Science on the Google Cloud Platform, 2nd Edition
- Author(s):
- Release date: March 2022
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781098118952