Chapter 8. Generating and Selecting Features for a Time Series

In the previous two chapters we examined methods of time series analysis that rely on using all the data points in a time series to fit a model. However, in preparation for the next chapter’s discussion of the application of machine learning to time series analysis, in this chapter we will study feature generation and selection for time series. If you are unfamiliar with the concept of feature generation, you will not remain so for long. It’s an intuitive process and one that brings out the creative side of data analysis.

Feature generation is the process of finding a quantitative way to encapsulate the most important traits of time series data into just a few numeric values and categorical labels. You are compressing the raw time series data into a shorter representation via a set of features to describe that time series (we’ll work through a quick example momentarily). For example, a very simple feature generation could describe every time series with its mean value and the number of time steps in the series. This would be one way of describing that time series without going through all the raw data step by step.
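As a minimal sketch of the two-feature description just mentioned (mean value and number of time steps), the following Python snippet reduces each series to a pair of numbers. The function name simple_features, the series names, and the toy values are all hypothetical, chosen only for illustration; they are not taken from the book's own examples.

```python
from statistics import mean


def simple_features(series):
    """Summarize one time series with two features: its mean and its length."""
    values = list(series)
    if not values:
        raise ValueError("series must contain at least one observation")
    return {"mean": mean(values), "n_steps": len(values)}


# Hypothetical toy data: two series of different lengths.
collection = {
    "sensor_a": [1.0, 2.0, 3.0, 4.0],
    "sensor_b": [10.0, 10.0, 10.0],
}

# Each series, whatever its length, is now described by the same two features.
features = {name: simple_features(values) for name, values in collection.items()}
```

Because every series ends up with the same fixed set of features, series of different lengths can be placed side by side in a single table, at the cost of discarding everything the two numbers do not capture.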

The purpose of feature generation is to compress as much information about the full time series as possible into a few metrics or, alternately, to use those metrics to identify the most important information about the time series and discard the rest. This is important for machine learning methods, most of which ...
