Chapter 17 Time Series Analysis
Time series analysis, in my experience, is not as common as you might expect in data science work. However, that seems to be largely an artifact of the datasets that data science has been applied to thus far, which tend to be legacy business spreadsheets and dumps of SQL databases. As sensor networks become ubiquitous, time series will come to play a much larger role in daily work. At that point, data scientists will have a lot of catching up to do, because electrical engineers have been analyzing time series for decades.
Typical applications of time series analysis in data science include the following:
- Predicting when/whether an event will occur, such as a failure of the machine generating the data
- Projecting the value of the time series at future points in time, such as a stock whose price we want to predict
- Identifying interesting patterns in a corpus of time series data that is too large for a human to comb through
All of these business applications can ultimately be formulated as machine learning problems. For example:
- If we are trying to predict whether a component is at risk of failure, this is a classification problem: we extract various features from the data to date (especially its recent history) and use them to predict a binary variable indicating whether the component will fail soon (say, in the next hour).
- Let's say we want to predict the value of a time series in the future. Well, finding the value of the time series an hour from now based on ...
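To make the failure-prediction framing concrete, here is a minimal sketch of how a single sensor series might be turned into rows of features and binary labels. Everything specific in it is an assumption made for illustration: the function name `make_failure_dataset`, the one-reading-per-minute sampling, the 60-step window and horizon, and the four summary features are hypothetical choices, not a prescribed recipe.

```python
from statistics import mean, pstdev


def make_failure_dataset(readings, failure_times, window=60, horizon=60):
    """Turn one sensor series into (features, label) rows.

    readings:      list of floats, assumed to be evenly spaced (e.g. one per minute).
    failure_times: indices into `readings` at which a failure was recorded.
    window:        number of most recent readings summarized as features (>= 2).
    horizon:       label is 1 if a failure occurs within this many steps ahead.
    """
    if window < 2:
        raise ValueError("window must be at least 2 to compute a slope")

    failures = sorted(failure_times)
    rows = []
    # Only use time points with a full window behind them and a full horizon ahead.
    for t in range(window, len(readings) - horizon + 1):
        recent = readings[t - window:t]
        features = {
            "mean": mean(recent),      # typical level over the window
            "std": pstdev(recent),     # how noisy the recent history is
            "last": recent[-1],        # most recent reading
            "slope": (recent[-1] - recent[0]) / (window - 1),  # crude trend
        }
        # Binary target: does any failure fall in [t, t + horizon)?
        label = int(any(t <= f < t + horizon for f in failures))
        rows.append((features, label))
    return rows
```

The resulting rows can be handed to whatever classifier you prefer. Note that each row's features use only readings before time `t`, so the model never sees the future it is asked to predict.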