Data Science on the Google Cloud Platform, 2nd Edition
by Valliappa Lakshmanan
March 2022
Beginner to intermediate
459 pages
12h 19m
English
O'Reilly Media, Inc.

Chapter 11. Time-Windowed Features for Real-Time Machine Learning

In Chapter 8, we briefly explored incorporating time-windowed features, such as the moving average of taxi-out delay at the originating airport, as an input to the model. We found that the time-windowed features reduced the model error. However, it was unclear how clients (who know only about the flight they are on) would be able to provide the correct value. Because of that, we decided to drop the time-windowed features. In this chapter, we will address that shortcoming by implementing a real-time, streaming machine learning pipeline that uses Cloud Dataflow and Vertex AI.

All of the code snippets in this chapter are available in the folder 11_realtime of the GitHub repository. See the README.md file in that directory for instructions on how to do the steps described in this chapter.

Time Averages

What time-windowed aggregate features did we want to use, but couldn’t? Flight arrival times are scheduled based on the average taxi-out time at the departure airport at that specific hour. The machine learning model will learn this average quite easily because we are showing it the entire dataset and telling it the name of the origin airport. For example, at peak hours in New York’s JFK airport, taxi-out times on the order of an hour are quite common, so airlines take that into account when publishing their flight schedules. It is only when the taxi-out time exceeds the average that we ought to be worried. ...
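
To make the idea of "exceeding the average" concrete, here is a minimal sketch in plain Python. It is not the pipeline used in this chapter; the sample records, the function names `average_taxi_out` and `excess_taxi_out`, and the choice to key the average on (origin airport, departure hour) are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample records (not real data):
# (origin airport, scheduled departure hour, taxi-out time in minutes).
flights = [
    ("JFK", 18, 55.0),
    ("JFK", 18, 65.0),
    ("JFK", 6, 20.0),
    ("SEA", 18, 15.0),
]


def average_taxi_out(records):
    """Return the mean taxi-out time keyed by (origin, hour)."""
    grouped = defaultdict(list)
    for origin, hour, taxi_out in records:
        grouped[(origin, hour)].append(taxi_out)
    return {key: mean(values) for key, values in grouped.items()}


def excess_taxi_out(origin, hour, taxi_out, averages):
    """Return how far a taxi-out time is above (+) or below (-) the typical value."""
    return taxi_out - averages[(origin, hour)]


averages = average_taxi_out(flights)

# A 75-minute taxi-out at JFK in the 18:00 hour, compared with that hour's average.
print(excess_taxi_out("JFK", 18, 75.0, averages))
```

In this sketch, a long taxi-out at a busy hour only stands out once the typical value for that airport and hour has been subtracted, which is the signal the paragraph above says is worth worrying about.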


ISBN: 9781098118945