Chapter 10. Creating ML Models to Predict Sequences
Chapter 9 introduced sequence data and the attributes of a time series, including seasonality, trend, autocorrelation, and noise. You created a synthetic series to use for predictions and explored how to do basic statistical forecasting. Over the next couple of chapters, you’ll learn how to use ML for forecasting. But before you start creating models, you need to understand how to structure time series data for training predictive models by creating what we’ll call a windowed dataset.
To understand why you need to do this, consider the time series you created in Chapter 9. You can see a plot of it in Figure 10-1.
Figure 10-1. Synthetic time series
If at any point you want to predict a value at time t, you’ll want to predict it as a function of the values preceding time t. For example, say you want to predict the value of the time series at time step 1,200 as a function of the 30 values preceding it. In this case, the values from time steps 1,170 to 1,199 would determine the value at time step 1,200, as shown in Figure 10-2.
Figure 10-2. Previous values impacting prediction
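To make the example concrete, here is a minimal sketch of picking out those 30 preceding values and the value they are meant to predict. It assumes the series is held in a plain Python list indexed by time step. The `series` built here is only a stand-in with a trend and a seasonal pattern, not the synthetic series from Chapter 9, and the names `window_size`, `window`, and `target` are chosen for illustration.

```python
import math

# Stand-in series: NOT the Chapter 9 synthetic series, just a placeholder
# with a trend and a seasonal pattern so the slicing can be demonstrated.
series = [0.05 * t + 10.0 * math.sin(2 * math.pi * t / 365) for t in range(1461)]

window_size = 30   # how many preceding values feed the prediction
t = 1200           # the time step you want to predict

# Values at time steps 1,170 through 1,199 (the end of a slice is exclusive).
window = series[t - window_size:t]

# The value at time step 1,200 that those 30 values should predict.
target = series[t]

print(len(window))  # 30
```

Because a Python slice excludes its end index, `series[1170:1200]` stops at time step 1,199, so the value being predicted never leaks into the values used to predict it.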
Now this begins to look familiar: you can consider the values at time steps 1,170 to 1,199 to be your features and the value at time step 1,200 to be your label. If you can ...