So far in this book, we have worked mostly with cross-sectional data, in which we have observations for entities at a single point in time. This includes the credit card dataset with transactions that happened over two days and the MNIST dataset with images of digits. For these datasets, we applied unsupervised learning to learn the underlying structure in the data and to group similar transactions and images together without using any labels.
Unsupervised learning is also very valuable for work with time series data, in which we have observations for a single entity at different time intervals. We need to develop a solution that can learn the underlying structure of data across time, not just for a particular moment in time. If we develop such a solution, we can identify similar time series patterns and group them together.
This is very impactful in fields such as finance, medicine, robotics, astronomy, biology, meteorology, etc., since professionals in these fields spend a lot of time analyzing data to classify current events based on how similar they are to past events. By grouping current events together with similar past events, these professionals are able to more confidently decide on the right course of action to take.
In this chapter, we will work on clustering time series data based on pattern similarity. Clustering time series data is a purely unsupervised approach and does not require annotation of data for training, although annotated ...