8

Forecasting: Breathe Easy; You Can't Win

As you saw in Chapters 3, 6, and 7, supervised machine learning is about predicting a value or classifying an observation using a model trained on past data. Forecasting is similar. Sure, you can forecast without data (astrology, anyone?). But in quantitative forecasting, past data is used to predict a future outcome. Indeed, some of the same techniques, such as multiple regression (introduced in Chapter 6), are used in both disciplines.

But where forecasting and supervised machine learning differ greatly is in their canonical problem spaces. Typical forecasting problems are about taking some quantity measured over time (sales, demand, supply, GDP, carbon emissions, or population, for example) and projecting it into the future. And in the presence of trends, cycles, and the occasional act of God, the future data can be wildly outside the bounds of the observed past.

And that's the problem with forecasting: unlike in Chapters 6 and 7, where pregnant women more or less keep buying the same stuff, forecasting is used in contexts where the future often looks nothing like the past.

Just when you think you have a good projection for housing demand, the housing bubble bursts and your forecast is in the toilet. Just when you think you have a good demand forecast, a flood disrupts your supply chain, limiting your supply, forcing you to raise prices, and throwing your sales completely out of whack. Future time series data can and will look different ...
